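What is the simplest (and hopefully not too slow) way to calculate the median with MySQL? I have used AVG(x) to find the mean, but I am having a hard time finding a simple way to calculate the median. For now I return all the rows to PHP, do a sort, and then pick the middle row, but surely there must be some simple way of doing it in a single MySQL query. Example data: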
id | val
--------
1 4
2 7
3 2
4 2
5 9
6 8
7 3
Sorting on val gives 2 2 3 4 7 8 9, so the median should be 4, whereas SELECT AVG(val) == 5.
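For reference, the sort-and-pick-the-middle approach described in the question can be sketched outside the database. This is a minimal Python sketch rather than the PHP the question mentions; the name `middle_value` is made up for illustration, and it takes the upper of the two middle values when the row count is even.

```python
from statistics import mean

def middle_value(values):
    """Sort the values and return the one in the middle.

    For an even number of values this returns the upper of the two
    middle values (no averaging), which is an assumption of this sketch.
    """
    ordered = sorted(values)
    return ordered[len(ordered) // 2]

vals = [4, 7, 2, 2, 9, 8, 3]   # the val column from the example data
median_val = middle_value(vals)
average_val = mean(vals)
print(median_val, average_val)
```

On the example data the middle value and the average differ, which is the gap the question is about.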
-------------------------------------------------------------------------------------------------------------------------
1. I just found another answer online. For medians in almost any SQL:
SELECT x.val from data x, data y
GROUP BY x.val
HAVING SUM(SIGN(1-SIGN(y.val-x.val))) = (COUNT(*)+1)/2
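Make sure your columns are well indexed and that the index is used for filtering and sorting. Verify with the explain plans.
select count(*) from data -- find the number of rows
Calculate the "median" row number: median_row = floor(count / 2).
Then pick it out of the list (the LIMIT offset is zero-based, so for 7 rows the offset 3 selects the fourth row):
select val from data order by val asc limit median_row,1
This should return one row with just the value you want. Jacob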
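The two-step count-then-LIMIT idea above can be tried end to end. The sketch below is only an illustration: it assumes SQLite (through Python's sqlite3 module) in place of MySQL, writes the offset as LIMIT 1 OFFSET ?, and the function name `median_by_offset` is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data (id INTEGER, val INTEGER)")
conn.executemany(
    "INSERT INTO data VALUES (?, ?)",
    [(1, 4), (2, 7), (3, 2), (4, 2), (5, 9), (6, 8), (7, 3)],
)

def median_by_offset(conn):
    # Step 1: find the number of rows.
    (count,) = conn.execute("SELECT COUNT(*) FROM data").fetchone()
    # Step 2: median_row = floor(count / 2), used as a zero-based offset.
    median_row = count // 2
    row = conn.execute(
        "SELECT val FROM data ORDER BY val ASC LIMIT 1 OFFSET ?",
        (median_row,),
    ).fetchone()
    return None if row is None else row[0]

result_offset = median_by_offset(conn)
print(result_offset)
```

With an even number of rows this picks the upper of the two middle rows; it does not average them.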
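2. The problem with the proposed solution (TheJacobTaylor) is that joining the table to itself is slow as molasses for large datasets. My proposed alternative runs in MySQL and uses an explicit ORDER BY, so you don't have to hope your indexes ordered the rows properly to give a correct result, and the query is easy to unwind for debugging.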
SELECT avg(t1.val) as median_val FROM (
SELECT @rownum:=@rownum+1 as `row_number`, d.val
FROM data d, (SELECT @rownum:=0) r
WHERE 1
-- put some where clause here
ORDER BY d.val
) as t1,
(
SELECT count(*) as total_rows
FROM data d
WHERE 1
-- put same where clause here
) as t2
WHERE 1
AND t1.row_number in ( floor((total_rows+1)/2), floor((total_rows+2)/2) );
[edit] Added AVG() around t1.val and row_number IN (...) so that the median is produced correctly when there is an even number of records. Reasoning:
SELECT floor((3+1)/2),floor((3+2)/2);#total_rows is 3, so avg row_numbers 2 and 2
SELECT floor((4+1)/2),floor((4+2)/2);#total_rows is 4, so avg row_numbers 2 and 3
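The row-number arithmetic in the reasoning above can be checked in a few lines. This is a small Python sketch of the same floor expressions, not part of the original query; `median_rows` and `median_by_rank` are illustrative names, and the input is assumed to be non-empty.

```python
def median_rows(total_rows):
    """1-based row numbers selected by the IN (...) condition."""
    return ((total_rows + 1) // 2, (total_rows + 2) // 2)

def median_by_rank(values):
    """Average the values at the selected row numbers (non-empty input)."""
    ordered = sorted(values)
    low, high = median_rows(len(ordered))
    return (ordered[low - 1] + ordered[high - 1]) / 2

print(median_rows(3), median_rows(4))
print(median_by_rank([4, 7, 2, 2, 9, 8, 3]))
```

For an odd count both row numbers coincide, so averaging them leaves the single middle value unchanged.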
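3. I found that the accepted solution didn't work on my MySQL install, returning an empty set, but this query worked for me in all the situations I tested it on: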
SELECT x.val from data x, data y
GROUP BY x.val
HAVING SUM(SIGN(1-SIGN(y.val-x.val)))/COUNT(*) > .5
LIMIT 1
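One possible explanation for the empty set is the equality test in the first self-join query. The Python sketch below simulates both HAVING conditions, assuming that SUM(SIGN(1-SIGN(y.val-x.val))) counts the joined rows with y.val <= x.val and that "/" in MySQL does not truncate; `self_join_candidates` is a made-up name and this is a model of the queries, not something run against MySQL.

```python
from collections import Counter

def self_join_candidates(values, exact=True):
    """Return the x.val groups that would pass the HAVING condition."""
    n = len(values)
    duplicates = Counter(values)
    hits = []
    for x in sorted(duplicates):
        # Each x row is joined with all n rows of y; the SUM counts
        # the joined rows where y.val <= x.val.
        le_count = duplicates[x] * sum(1 for y in values if y <= x)
        group_size = duplicates[x] * n          # COUNT(*) of the group
        if exact:
            passes = le_count == (group_size + 1) / 2
        else:
            passes = le_count / group_size > 0.5
        if passes:
            hits.append(x)
    return hits

print(self_join_candidates([4, 7, 2, 2, 9, 8, 3], exact=True))
print(self_join_candidates([1, 2, 3, 4], exact=True))
print(self_join_candidates([1, 2, 3, 4], exact=False))
```

Under these assumptions the equality version finds nothing for an even number of rows, while the ratio version returns every value from the upper middle onward, so its LIMIT 1 depends on the groups coming back in ascending order.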
4. A comment on this page of the MySQL documentation has the following suggestion:
-- (mostly) High Performance scaling MEDIAN function per group
-- Median defined in CodeGo.net
--
-- by Peter Hlavac
-- 06.11.2008
--
-- Example Table:
DROP table if exists table_median;
CREATE TABLE table_median (id INTEGER(11),val INTEGER(11));
COMMIT;
INSERT INTO table_median (id, val) VALUES
(1, 7), (1, 4), (1, 5), (1, 1), (1, 8), (1, 3), (1, 6),
(2, 4),
(3, 5), (3, 2),
(4, 5), (4, 12), (4, 1), (4, 7);
-- Calculating the MEDIAN
SELECT @a := 0;
SELECT
id,
AVG(val) AS MEDIAN
FROM (
SELECT
id,
val
FROM (
SELECT
-- Create an index n for every id
@a := (@a + 1) mod o.c AS shifted_n,
IF(@a mod o.c=0, o.c, @a) AS n,
o.id,
o.val,
-- the number of elements for every id
o.c
FROM (
SELECT
t_o.id,
val,
c
FROM
table_median t_o INNER JOIN
(SELECT
id,
COUNT(1) AS c
FROM
table_median
GROUP BY
id
) t2
ON (t2.id = t_o.id)
ORDER BY
t_o.id,val
) o
) a
WHERE
IF(
-- if there is an even number of elements
-- take the lower and the upper median
-- and use AVG(lower,upper)
c MOD 2 = 0,
n = c DIV 2 OR n = (c DIV 2)+1,
-- if its an odd number of elements
-- take the first if its only one element
-- or take the one in the middle
IF(
c = 1,
n = 1,
n = c DIV 2 + 1
)
)
) a
GROUP BY
id;
-- Explanation:
-- The Statement creates a helper table like
--
-- n id val count
-- ----------------
-- 1, 1, 1, 7
-- 2, 1, 3, 7
-- 3, 1, 4, 7
-- 4, 1, 5, 7
-- 5, 1, 6, 7
-- 6, 1, 7, 7
-- 7, 1, 8, 7
--
-- 1, 2, 4, 1
-- 1, 3, 2, 2
-- 2, 3, 5, 2
--
-- 1, 4, 1, 4
-- 2, 4, 5, 4
-- 3, 4, 7, 4
-- 4, 4, 12, 4
-- from there we can select the n-th element on the position: count div 2 + 1
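As a cross-check for the per-group query above, the medians of the example table can be computed independently. This is a minimal Python sketch using the standard library's statistics.median (which averages the two middle values for even counts); `median_per_group` is an illustrative name.

```python
from collections import defaultdict
from statistics import median

# Same rows as the INSERT INTO table_median statement above.
rows = [(1, 7), (1, 4), (1, 5), (1, 1), (1, 8), (1, 3), (1, 6),
        (2, 4),
        (3, 5), (3, 2),
        (4, 5), (4, 12), (4, 1), (4, 7)]

def median_per_group(rows):
    """Group the values by id and take the median of each group."""
    groups = defaultdict(list)
    for group_id, val in rows:
        groups[group_id].append(val)
    return {group_id: median(vals) for group_id, vals in sorted(groups.items())}

expected_medians = median_per_group(rows)
print(expected_medians)
```

These are the values the MEDIAN column of the query should match for ids 1 to 4.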
5. You could use the function found here.
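6. I propose a faster way. Get the row count:
SELECT CEIL(COUNT(*)/2) FROM data;
Then take the maximum of the first @middlevalue rows of a sorted subquery:
SELECT max(val) FROM (SELECT val FROM data ORDER BY val limit @middlevalue) x;
I tested this on a data set of 5×10e6 random numbers and it finds the median in under 10 seconds.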
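The max-of-the-first-half idea can be sketched as follows. This assumes SQLite via Python's sqlite3 module in place of MySQL, and passes the row limit as a bound parameter; as far as I know, MySQL only accepts a non-constant LIMIT inside a prepared statement or stored program, so the @middlevalue form would need one of those. The function name is hypothetical.

```python
import sqlite3

def median_by_max_of_first_half(values):
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE data (val INTEGER)")
    conn.executemany("INSERT INTO data VALUES (?)", [(v,) for v in values])
    # Step 1: CEIL(COUNT(*)/2), computed here with integer arithmetic.
    (count,) = conn.execute("SELECT COUNT(*) FROM data").fetchone()
    middle = (count + 1) // 2
    # Step 2: the largest value among the first `middle` sorted rows.
    (result,) = conn.execute(
        "SELECT MAX(val) FROM (SELECT val FROM data ORDER BY val LIMIT ?) x",
        (middle,),
    ).fetchone()
    conn.close()
    return result

print(median_by_max_of_first_half([4, 7, 2, 2, 9, 8, 3]))
```

For an even number of rows this returns the lower of the two middle values rather than their average.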
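7. Building off of velcro's answer, for those of you who have to take the median of something that is grouped by another field:
SELECT grp_field, t1.val FROM (
SELECT grp_field,
-- restart the numbering at 1 whenever grp_field changes
@rownum:=IF(@s = grp_field, @rownum + 1, 1) AS `row_number`,
@s:=IF(@s = grp_field, @s, grp_field) AS sec, d.val
FROM data d, (SELECT @rownum:=0, @s:=0) r
ORDER BY grp_field, d.val
) AS t1 JOIN (
SELECT grp_field, COUNT(*) AS total_rows
FROM data d
GROUP BY grp_field
) AS t2
ON t1.grp_field = t2.grp_field
WHERE t1.`row_number` = FLOOR(total_rows / 2) + 1;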
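8. Unfortunately, neither TheJacobTaylor's nor velcro's answer returns accurate results for current versions of MySQL. velcro's answer above is close, but it does not calculate correctly for result sets with an even number of rows. The median is defined as either 1) the middle number for odd-sized sets, or 2) the average of the two middle numbers for even-sized sets. So here is velcro's solution, patched to handle both odd and even sets: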
SELECT AVG(middle_values) AS 'median' FROM (
SELECT t1.median_column AS 'middle_values' FROM
(
SELECT @row:=@row+1 as `row`, x.median_column
FROM median_table AS x, (SELECT @row:=0) AS r
WHERE 1
-- put some where clause here
ORDER BY x.median_column
) AS t1,
(
SELECT COUNT(*) as 'count'
FROM median_table x
WHERE 1
-- put same where clause here
) AS t2
-- the following condition will return 1 record for odd number sets, or 2 records for even number sets.
WHERE t1.row >= t2.count/2 and t1.row <= ((t2.count/2) +1)) AS t3;
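To use it, follow these 3 easy steps:
Replace "median_table" (2 occurrences) in the above code with the name of your table.
Replace "median_column" (3 occurrences) with the name of the column you want the median of.
If you have a WHERE condition, replace "WHERE 1" (2 occurrences) with your WHERE condition.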
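The range condition in the final WHERE clause of the query above can be checked quickly. The Python sketch below lists which 1-based row numbers satisfy count/2 <= row <= count/2 + 1, assuming the division is not truncated; `selected_rows` is an illustrative name.

```python
def selected_rows(count):
    """Row numbers kept by: row >= count/2 AND row <= count/2 + 1."""
    return [row for row in range(1, count + 1)
            if count / 2 <= row <= count / 2 + 1]

for count in (1, 2, 3, 4, 7):
    print(count, selected_rows(count))
```

Odd counts keep a single middle row and even counts keep the two middle rows, which is what the outer AVG() then relies on.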
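9. Most of the solutions above work only for one field of the table; you might need to get the median (50th percentile) for many fields in one query. Like this: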
SELECT CAST(SUBSTRING_INDEX(SUBSTRING_INDEX(
GROUP_CONCAT(field_name ORDER BY field_name SEPARATOR ','),
',', 50/100 * COUNT(*) + 1), ',', -1) AS DECIMAL) AS `Median`
FROM table_name;
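You can replace the "50" in the example above with any percentile; it is very efficient. Just make sure you have enough memory for the GROUP_CONCAT; you can change the limit with: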
SET group_concat_max_len = 10485760; #10MB max length
10. Takes care of an even value count: in that case it gives the average of the two middle values.
SELECT AVG(val) FROM
( SELECT x.id, x.val from data x, data y
GROUP BY x.id, x.val
HAVING SUM(SIGN(1-SIGN(IF(y.val-x.val=0 AND x.id != y.id, SIGN(x.id-y.id), y.val-x.val)))) IN (ROUND((COUNT(*))/2), ROUND((COUNT(*)+1)/2))
) sq
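11. I used a two-query approach: the first query gets the count, min, max and avg; the second one (a prepared statement) uses "LIMIT @count/2, 1" and "ORDER BY .." clauses to get the median value. These are wrapped in a function definition, so all the values can be returned from one call. If your ranges are static and your data does not change often, it might be more efficient to store these values and use the stored values instead of querying from scratch every time.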
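12. The simple answer: take the value at the following index: COUNT(*)/2, rounded. Whether COUNT(*)/2 is rounded up or down depends on your definition of the median, or you could write an IF for the even case and average the two middle values (the one at "COUNT(*)/2" rounded down and the one at "COUNT(*)/2" rounded up).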
13. If MySQL has ROW_NUMBER, then the median is (inspired by this SQL Server query):
WITH Numbered AS
(
SELECT *, COUNT(*) OVER () AS Cnt,
ROW_NUMBER() OVER (ORDER BY val) AS RowNum
FROM yourtable
)
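SELECT id, val
FROM Numbered
WHERE RowNum IN ((Cnt+1) DIV 2, (Cnt+2) DIV 2)
;
The IN is used in case you have an even number of entries; DIV is MySQL's integer division, which the row arithmetic relies on. If you want to find the median per group, then just PARTITION BY the group in your OVER clauses. Rob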
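The window-function approach, including the PARTITION BY variant, can be tried on the table_median rows from answer 4. This sketch assumes SQLite 3.25 or newer through Python's sqlite3 module instead of MySQL, and wraps the two middle rows in AVG(); in SQLite "/" between integers already truncates, whereas MySQL would need DIV or FLOOR for the same arithmetic.

```python
import sqlite3

rows = [(1, 7), (1, 4), (1, 5), (1, 1), (1, 8), (1, 3), (1, 6),
        (2, 4),
        (3, 5), (3, 2),
        (4, 5), (4, 12), (4, 1), (4, 7)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_median (id INTEGER, val INTEGER)")
conn.executemany("INSERT INTO table_median VALUES (?, ?)", rows)

QUERY = """
WITH Numbered AS (
    SELECT id, val,
           COUNT(*) OVER (PARTITION BY id) AS Cnt,
           ROW_NUMBER() OVER (PARTITION BY id ORDER BY val) AS RowNum
    FROM table_median
)
SELECT id, AVG(val) AS median
FROM Numbered
WHERE RowNum IN ((Cnt + 1) / 2, (Cnt + 2) / 2)  -- integer division in SQLite
GROUP BY id
ORDER BY id
"""

window_medians = conn.execute(QUERY).fetchall()
print(window_medians)
```

MySQL 8.0 added both window functions and common table expressions, so a query of this shape is no longer hypothetical there.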
14. My code, efficient without tables or additional variables:
SELECT
((SUBSTRING_INDEX(SUBSTRING_INDEX(group_concat(val order by val), ',', floor(1+((count(val)-1) / 2))), ',', -1))
+
(SUBSTRING_INDEX(SUBSTRING_INDEX(group_concat(val order by val), ',', ceiling(1+((count(val)-1) / 2))), ',', -1)))/2
as median
FROM table;
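To see how the nested SUBSTRING_INDEX calls pick out the middle values, here is a small Python model of the query above. The helper `substring_index` is a hand-written imitation of MySQL's function for the comma-separated case, and `median_via_concat` is an illustrative name; neither is run against MySQL.

```python
import math

def substring_index(text, delim, count):
    """Imitate SUBSTRING_INDEX: keep `count` fields from the left,
    or `-count` fields from the right when count is negative."""
    parts = text.split(delim)
    if count > 0:
        return delim.join(parts[:count])
    if count < 0:
        return delim.join(parts[count:])
    return ""

def median_via_concat(values):
    # group_concat(val order by val)
    concatenated = ",".join(str(v) for v in sorted(values))
    position = 1 + (len(values) - 1) / 2
    lower = substring_index(
        substring_index(concatenated, ",", math.floor(position)), ",", -1)
    upper = substring_index(
        substring_index(concatenated, ",", math.ceil(position)), ",", -1)
    return (float(lower) + float(upper)) / 2

print(median_via_concat([4, 7, 2, 2, 9, 8, 3]))
print(median_via_concat([1, 2, 3, 4]))
```

In MySQL the concatenated string is cut off at group_concat_max_len (1024 bytes by default), so on larger tables that setting would have to be raised for this approach to give the right value.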
15. Alternatively, you could also do this in a stored procedure:
DROP PROCEDURE IF EXISTS median;
DELIMITER //
CREATE PROCEDURE median (table_name VARCHAR(255), column_name VARCHAR(255), where_clause VARCHAR(255))
BEGIN
-- Set default parameters
IF where_clause IS NULL OR where_clause = '' THEN
SET where_clause = 1;
END IF;
-- Prepare statement
SET @sql = CONCAT(
"SELECT AVG(middle_values) AS 'median' FROM (
SELECT t1.", column_name, " AS 'middle_values' FROM
(
SELECT @row:=@row+1 as `row`, x.", column_name, "
FROM ", table_name," AS x, (SELECT @row:=0) AS r
WHERE ", where_clause, " ORDER BY x.", column_name, "
) AS t1,
(
SELECT COUNT(*) as 'count'
FROM ", table_name, " x
WHERE ", where_clause, "
) AS t2
-- the following condition will return 1 record for odd number sets, or 2 records for even number sets.
WHERE t1.row >= t2.count/2
AND t1.row <= ((t2.count/2)+1)) AS t3
");
-- Execute statement
PREPARE stmt FROM @sql;
EXECUTE stmt;
END//
DELIMITER ;
-- Sample usage:
-- median(table_name, column_name, where_condition);
CALL median('products', 'price', NULL);