跳到主要内容

Advanced Aggregate Functions in Cloudberry Database

The following built-in advanced analytic functions are Cloudberry extensions of the PostgreSQL database. Analytic functions are immutable.

Note The Cloudberry MADlib Extension for Analytics provides additional advanced functions to perform statistical analysis and machine learning with Cloudberry Database data.

FunctionReturn TypeFull SyntaxDescription
MEDIAN (expr)timestamp, timestamptz, interval, floatMEDIAN (expression) 1Can take a two-dimensional array as input. Treats such arrays as matrices.
PERCENTILE_CONT (expr) WITHIN GROUP (ORDER BY expr [DESC/ASC])timestamp, timestamptz, interval, floatPERCENTILE_CONT(percentage) WITHIN GROUP (ORDER BY expression) 2Performs an inverse distribution function that assumes a continuous distribution model. It takes a percentile value and a sort specification and returns the same datatype as the numeric datatype of the argument. This returned value is a computed result after performing linear interpolation. Null are ignored in this calculation.
PERCENTILE_DISC (expr) WITHIN GROUP (ORDER BY expr [DESC/ASC])timestamp, timestamptz, interval, floatPERCENTILE_DISC(percentage) WITHIN GROUP (ORDER BY expression) 3Performs an inverse distribution function that assumes a discrete distribution model. It takes a percentile value and a sort specification. This returned value is an element from the set. Null are ignored in this calculation.
sum(array[])smallint[]int[], bigint[], float[]sum(array[[1,2],[3,4]]) 4Performs matrix summation. Can take as input a two-dimensional array that is treated as a matrix.
pivot_sum (label[], label, expr)int[], bigint[], float[]pivot_sum( array['A1','A2'], attr, value)A pivot aggregation using sum to resolve duplicate entries.
unnest (array[])set of anyelementunnest( array['one', 'row', 'per', 'item'])Transforms a one-dimensional array into rows. Returns a set of anyelement, a polymorphic pseudotype in PostgreSQL.

Footnotes

  1. SELECT department_id, MEDIAN(salary) 
    FROM employees
    GROUP BY department_id;
  2. SELECT department_id,
    PERCENTILE_CONT (0.5) WITHIN GROUP (ORDER BY salary DESC)
    "Median_cont";
    FROM employees GROUP BY department_id;
  3. SELECT department_id, 
    PERCENTILE_DISC (0.5) WITHIN GROUP (ORDER BY salary DESC)
    "Median_desc";
    FROM employees GROUP BY department_id;
  4. CREATE TABLE mymatrix (myvalue int[]);
    INSERT INTO mymatrix VALUES (array[[1,2],[3,4]]);
    INSERT INTO mymatrix VALUES (array[[0,1],[1,0]]);
    SELECT sum(myvalue) FROM mymatrix;
    sum
    ---------------
    {{1,3},{4,4}}