分析香港巴士路線之小實驗

香港政府一站通裡,運輸署提供了公共交通路線及收費資料的資料庫。可惜它提供的是 MS Access 的格式,還要將它分成多個 mdb 檔案,令處理變得困難。到最近,我終於找到開源工具 mdb-sqlite,幫我將 mdb 檔案轉成 SQLite 格式。

下載 Java 原始碼後先到解壓的資料夾將它編譯好:
ant dist

然後將全部 mdb 檔都轉成 SQLite 格式:
ls /home/user/publictransport/*.mdb | xargs -i java -jar dist/mdb-sqlite.jar {} {}.sqlite

再開一個新 SQLite 資料庫去將那些 SQLite 檔案結合成一個資料庫:
ATTACH DATABASE '/home/user/publictransport/COMPANY_CODE.mdb.sqlite' as 'COMPANY_CODE';
CREATE TABLE COMPANY_CODE AS SELECT * FROM COMPANY_CODE.COMPANY_CODE;
DETACH 'COMPANY_CODE';

ATTACH DATABASE '/home/user/publictransport/ROUTE_BUS.mdb.sqlite' as 'ROUTE_BUS';
CREATE TABLE ROUTE_BUS AS SELECT * FROM ROUTE_BUS.ROUTE;
DETACH 'ROUTE_BUS';

ATTACH DATABASE '/home/user/publictransport/ROUTE_FERRY.mdb.sqlite' as 'ROUTE_FERRY';
CREATE TABLE ROUTE_FERRY AS SELECT * FROM ROUTE_FERRY.ROUTE;
DETACH 'ROUTE_FERRY';

ATTACH DATABASE '/home/user/publictransport/ROUTE_GMB.mdb.sqlite' as 'ROUTE_GMB';
CREATE TABLE ROUTE_GMB AS SELECT * FROM ROUTE_GMB.ROUTE;
DETACH 'ROUTE_GMB';

ATTACH DATABASE '/home/user/publictransport/ROUTE_PTRAM.mdb.sqlite' as 'ROUTE_PTRAM';
CREATE TABLE ROUTE_PTRAM AS SELECT * FROM ROUTE_PTRAM.ROUTE;
DETACH 'ROUTE_PTRAM';

ATTACH DATABASE '/home/user/publictransport/ROUTE_TRAM.mdb.sqlite' as 'ROUTE_TRAM';
CREATE TABLE ROUTE_TRAM AS SELECT * FROM ROUTE_TRAM.ROUTE;
DETACH 'ROUTE_TRAM';

ATTACH DATABASE '/home/user/publictransport/RSTOP_BUS.mdb.sqlite' as 'RSTOP_BUS';
CREATE TABLE RSTOP_BUS AS SELECT * FROM RSTOP_BUS.RSTOP;
DETACH 'RSTOP_BUS';

ATTACH DATABASE '/home/user/publictransport/RSTOP_FERRY.mdb.sqlite' as 'RSTOP_FERRY';
CREATE TABLE RSTOP_FERRY AS SELECT * FROM RSTOP_FERRY.RSTOP;
DETACH 'RSTOP_FERRY';

ATTACH DATABASE '/home/user/publictransport/RSTOP_GMB.mdb.sqlite' as 'RSTOP_GMB';
CREATE TABLE RSTOP_GMB AS SELECT * FROM RSTOP_GMB.RSTOP;
DETACH 'RSTOP_GMB';

ATTACH DATABASE '/home/user/publictransport/RSTOP_PTRAM.mdb.sqlite' as 'RSTOP_PTRAM';
CREATE TABLE RSTOP_PTRAM AS SELECT * FROM RSTOP_PTRAM.RSTOP;
DETACH 'RSTOP_PTRAM';

ATTACH DATABASE '/home/user/publictransport/RSTOP_TRAM.mdb.sqlite' as 'RSTOP_TRAM';
CREATE TABLE RSTOP_TRAM AS SELECT * FROM RSTOP_TRAM.RSTOP;
DETACH 'RSTOP_TRAM';

ATTACH DATABASE '/home/user/publictransport/STOP_BUS.mdb.sqlite' as 'STOP_BUS';
CREATE TABLE STOP_BUS AS SELECT * FROM STOP_BUS.STOP;
DETACH 'STOP_BUS';

ATTACH DATABASE '/home/user/publictransport/STOP_FERRY.mdb.sqlite' as 'STOP_FERRY';
CREATE TABLE STOP_FERRY AS SELECT * FROM STOP_FERRY.STOP;
DETACH 'STOP_FERRY';

ATTACH DATABASE '/home/user/publictransport/STOP_GMB.mdb.sqlite' as 'STOP_GMB';
CREATE TABLE STOP_GMB AS SELECT * FROM STOP_GMB.STOP;
DETACH 'STOP_GMB';

ATTACH DATABASE '/home/user/publictransport/STOP_PTRAM.mdb.sqlite' as 'STOP_PTRAM';
CREATE TABLE STOP_PTRAM AS SELECT * FROM STOP_PTRAM.STOP;
DETACH 'STOP_PTRAM';

ATTACH DATABASE '/home/user/publictransport/STOP_TRAM.mdb.sqlite' as 'STOP_TRAM';
CREATE TABLE STOP_TRAM AS SELECT * FROM STOP_TRAM.STOP;
DETACH 'STOP_TRAM';

這樣,我們便得出一個方便我們查詢的資料庫了。

這次我們集中研究香港的巴士服務,所以只處理 ROUTE_BUS、RSTOP_BUS 和 STOP_BUS 資料表,為它們加上索引,又將 STOP_BUS 裡的 X 和 Y 座標轉成地理格式。利用之前介紹過的 spatialite_gui 打開剛才合併出來的 SQLite 資料庫可幫助我們進行處理。
CREATE INDEX idx_1 ON "ROUTE_BUS" (ROUTE_NAMEC);
CREATE INDEX idx_2 ON "ROUTE_BUS" (COMPANY_CODE);
CREATE INDEX idx_3 ON "ROUTE_BUS" (ROUTE_ID);
CREATE INDEX idx_4 ON "RSTOP_BUS" (ROUTE_ID);
CREATE INDEX idx_5 ON "RSTOP_BUS" (STOP_ID);
CREATE INDEX idx_6 ON "STOP_BUS" (STOP_ID);

SELECT AddGeometryColumn('STOP_BUS', 'Geometry', 2326, 'POINT', 'XY')
UPDATE STOP_BUS  SET Geometry = MakePoint(X,Y,2326);

我們利用這個資料庫可找出全港巴士站的分佈:
SELECT ST_Union(Geometry) FROM STOP_BUS;

bus_route1

結果當然亦可輸出成其他格式於 GIS 程式中使用:
bus_route2

找出九巴的巴士站分佈:
SELECT ST_Union(MakePoint(X,Y,2326)) FROM ROUTE_BUS
LEFT JOIN RSTOP_BUS ON ROUTE_BUS.ROUTE_ID = RSTOP_BUS.ROUTE_ID
LEFT JOIN STOP_BUS ON RSTOP_BUS.STOP_ID = STOP_BUS.STOP_ID
WHERE COMPANY_CODE LIKE '%KMB%';

bus_route3

找出城巴及新巴的巴士站分佈:
SELECT ST_Union(MakePoint(X,Y,2326)) FROM ROUTE_BUS
LEFT JOIN RSTOP_BUS ON ROUTE_BUS.ROUTE_ID = RSTOP_BUS.ROUTE_ID
LEFT JOIN STOP_BUS ON RSTOP_BUS.STOP_ID = STOP_BUS.STOP_ID
WHERE COMPANY_CODE LIKE '%CTB%' OR COMPANY_CODE LIKE '%NWFB%';

bus_route4

找出九巴 53 號線的巴士站分佈:
SELECT ST_Union(MakePoint(X,Y,2326)) FROM ROUTE_BUS
LEFT JOIN RSTOP_BUS ON ROUTE_BUS.ROUTE_ID = RSTOP_BUS.ROUTE_ID
LEFT JOIN STOP_BUS ON RSTOP_BUS.STOP_ID = STOP_BUS.STOP_ID
WHERE COMPANY_CODE = 'KMB' AND ROUTE_NAMEC = '53';

bus_route5

找出九巴以 7 為首、X 為尾的路線巴士站分佈:
SELECT ST_Union(MakePoint(X,Y,2326)) FROM ROUTE_BUS
LEFT JOIN RSTOP_BUS ON ROUTE_BUS.ROUTE_ID = RSTOP_BUS.ROUTE_ID
LEFT JOIN STOP_BUS ON RSTOP_BUS.STOP_ID = STOP_BUS.STOP_ID
WHERE COMPANY_CODE = 'KMB' AND ROUTE_NAMEC LIKE '7%X';

bus_route6

利用這段 SQL 更可以分出不同路線:
SELECT ROUTE_NAMEC, ST_Union(MakePoint(X,Y,2326)) FROM ROUTE_BUS
LEFT JOIN RSTOP_BUS ON ROUTE_BUS.ROUTE_ID = RSTOP_BUS.ROUTE_ID
LEFT JOIN STOP_BUS ON RSTOP_BUS.STOP_ID = STOP_BUS.STOP_ID
WHERE COMPANY_CODE = 'KMB' AND ROUTE_NAMEC LIKE '7%X'
GROUP BY ROUTE_NAMEC;

輸出成 Shape File,更可在 GIS 裡分別顯示出來:
bus_route7
bus_route8

由於之後我們需要將香港大地座標,轉換成 WGS84 方便計算距離,也需要預先處理好全港每一個巴士站的服務範圍(假設為三百米),以加速資料庫存取,所以我們要預先加上兩列資料:
SELECT AddGeometryColumn('STOP_BUS', 'Geometry_4326', 4326, 'Point', 'XY');
UPDATE STOP_BUS  SET Geometry_4326 = ST_transform(STOP_BUS.Geometry,4326);

SELECT AddGeometryColumn('STOP_BUS', 'Geometry_Buffer', 2326, 'Polygon', 'XY');
UPDATE STOP_BUS  SET Geometry_Buffer = st_buffer(STOP_BUS.Geometry,300);

查出大埔中心某一點三百米範圍內所經過的所有巴士線:
SELECT ROUTE_NAMEC, COMPANY_CODE
FROM ROUTE_BUS
LEFT JOIN RSTOP_BUS ON ROUTE_BUS.ROUTE_ID = RSTOP_BUS.ROUTE_ID
LEFT JOIN STOP_BUS ON RSTOP_BUS.STOP_ID = STOP_BUS.STOP_ID
WHERE PtDistWithin(Geometry_4326,MakePoint(114.1689,22.4518,4326), 300, 1)
GROUP BY ROUTE_NAMEC, COMPANY_CODE;

bus_route9

查出大埔中心某一點三百米範圍內所經過的所有巴士線及它們的巴士站位置:
SELECT ROUTE_NAMEC, COMPANY_CODE, ST_Transform(CastToMultiPoint(ST_UNION(Geometry)),4326)
FROM ROUTE_BUS
LEFT JOIN RSTOP_BUS ON ROUTE_BUS.ROUTE_ID = RSTOP_BUS.ROUTE_ID
LEFT JOIN STOP_BUS ON RSTOP_BUS.STOP_ID = STOP_BUS.STOP_ID
WHERE PtDistWithin(Geometry_4326,MakePoint(114.1689,22.4518,4326), 300, 1)
GROUP BY ROUTE_NAMEC, COMPANY_CODE;

bus_route10

查出大埔中心某一點三百米範圍內所經過的所有巴士線,它們沿線的所有巴士站位置:
SELECT ST_Union(STOP_BUS.Geometry) FROM ROUTE_BUS AS a
LEFT JOIN
(SELECT Geometry, Geometry_4326, ROUTE_NAMEC, COMPANY_CODE
FROM ROUTE_BUS
LEFT JOIN RSTOP_BUS ON ROUTE_BUS.ROUTE_ID = RSTOP_BUS.ROUTE_ID
LEFT JOIN STOP_BUS ON RSTOP_BUS.STOP_ID = STOP_BUS.STOP_ID
) AS b ON a.ROUTE_NAMEC = b.ROUTE_NAMEC AND a.COMPANY_CODE = b.COMPANY_CODE
LEFT JOIN RSTOP_BUS ON a.ROUTE_ID = RSTOP_BUS.ROUTE_ID
LEFT JOIN STOP_BUS ON RSTOP_BUS.STOP_ID = STOP_BUS.STOP_ID
WHERE PtDistWithin(b.Geometry_4326, MakePoint(114.1689,22.4518,4326), 300, 1);

bus_route11
bus_route12

假設我們將每一個巴士站的服務範圍定為三百米,大埔中心某一點三百米範圍內所經過的所有巴士線可以直達的不重複面積為:
SELECT ST_Area(ST_Union(Result)) / 1000 / 1000
FROM
(SELECT c.geometry_buffer AS Result FROM ROUTE_BUS AS a
LEFT JOIN
(SELECT Geometry_4326,ROUTE_NAMEC, COMPANY_CODE, STOP_BUS.STOP_ID
FROM ROUTE_BUS
LEFT JOIN RSTOP_BUS ON ROUTE_BUS.ROUTE_ID = RSTOP_BUS.ROUTE_ID
LEFT JOIN STOP_BUS ON RSTOP_BUS.STOP_ID = STOP_BUS.STOP_ID
) AS b ON a.ROUTE_NAMEC = b.ROUTE_NAMEC AND a.COMPANY_CODE = b.COMPANY_CODE
LEFT JOIN RSTOP_BUS ON a.ROUTE_ID = RSTOP_BUS.ROUTE_ID
LEFT JOIN STOP_BUS AS c ON RSTOP_BUS.STOP_ID = c.STOP_ID
WHERE PtDistWithin(b.Geometry_4326 ,MakePoint(114.1689,22.4518, 4326), 300, 1)
GROUP BY c.STOP_ID);

結果是 69.88 平方公里。

全港的巴士站覆蓋面積為:
SELECT ST_Area(ST_Union(Geometry_buffer)) FROM STOP_BUS;

結果是 265.65 平方公里。

由此,我們所以計出大埔中心某一點的巴士服務可到達的「覆蓋率」,當然我在計算時忽略了很多重要因素,例如班次、路線是否特別班次、是否單方向、是否容易接駁其他交通、是否直達商業區等:
SELECT ST_Area(ST_Union(Result)) / 265654308.142386 * 100
FROM
(SELECT c.geometry_buffer AS Result FROM ROUTE_BUS AS a
LEFT JOIN
(SELECT Geometry_4326,ROUTE_NAMEC, COMPANY_CODE, STOP_BUS.STOP_ID
FROM ROUTE_BUS
LEFT JOIN RSTOP_BUS ON ROUTE_BUS.ROUTE_ID = RSTOP_BUS.ROUTE_ID
LEFT JOIN STOP_BUS ON RSTOP_BUS.STOP_ID = STOP_BUS.STOP_ID
) AS b ON a.ROUTE_NAMEC = b.ROUTE_NAMEC AND a.COMPANY_CODE = b.COMPANY_CODE
LEFT JOIN RSTOP_BUS ON a.ROUTE_ID = RSTOP_BUS.ROUTE_ID
LEFT JOIN STOP_BUS AS c ON RSTOP_BUS.STOP_ID = c.STOP_ID
WHERE PtDistWithin(b.Geometry_4326 ,MakePoint(114.1689,22.4518, 4326), 300, 1)
GROUP BY c.STOP_ID);

結果是 26.31%。

經過計算,以下地區的巴士服務覆蓋率為:
地點		%
大埔中心		26.31
富亨邨亨耀樓	5.12
富亨邨亨隆樓	13.11
太和站		19.34
廣福邨		21.76
沙田市中心	27.97
大圍站		22.69
旺角站		45.13
彩虹邨金漢樓	44.97
元朗東		29.01
元朗西		30.41
荃灣站		29.37
荃灣西站		22.12
香港科技大學	3.75
香港濕地公園	8.17
香港文化中心	36.99

原來 spatialite_gui 也可以很方便地輸出到 PostGIS,而且 PostGIS 更是快一點,功能也更完整,只是那些 SQL 要稍作修改方能使用:
bus_route13

內部連結:
【目錄】地理/地理資訊系統/空間資料庫/大地測量內部連結

本文連結