From OpenGeofiction Encyclopedia

16.01.2019 - Auto-Calculating Inter-City Travel Passenger Numbers V.2

Hello everyone, buckle up for a ride.

If you remember my diary entry from... over a year ago, you might already know that I'm a little obsessive about calculating realistic passenger numbers for Kojo's inter-city high-speed trains. I've now finished the first part of a new Excel spreadsheet to do just that, and need general feedback and advice, which I will come back to at the end of this diary. I'm more or less copying a classical gravity model, which means that traffic between nodes is proportional to the two cities' populations and depends on the distance between them (just like the gravitational force, which is a function of two bodies' masses and the distance between them). However, I'm deviating from the standard approach at some points, especially in the second section. I will also add some footnotes to point out some other sources of error.

First step: We need to divide Kojo into cells, assign every cell a population, and then set up a matrix which shows the distance between every two cells ("Aufwand", i.e. effort, in my Excel table, as a means to measure the effort it takes to travel between the two cities[1]). Because we can't just continue with that distance, we need to transform this resistance into a rating ("Bewertung"), which is where the magic takes place: B = A^(-α), where alpha is a parameter indicating how sensitively travelers react to an increase in distance. If alpha grows, people will travel more to nearby destinations and less to those far away. The Excel sheet includes a small section to play around with this and find a suitable parameter. I've heard of realistic values being in the range of 0.3 to 0.4, where 0.4 for example means that if there are two cities with equal population, one being 300 km and the other 600 km away from a third city, 1.32 times as many people will travel to the closer city than to the one further away (since 2^0.4 ≈ 1.32). Now we have a matrix that contains this "Bewertung" for every point-to-point relation in Kojo. Yay!
Images: Kojo cells.png; the air-distance matrix; converting A into a rating; the sandbox for parameter testing
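For anyone who wants to play with the parameter outside of Excel, here is a minimal Python sketch of the rating step. The function name `rating` is just an illustrative choice, not something from the spreadsheet; the 300 km / 600 km example is the one from the text.

```python
def rating(distance_km: float, alpha: float) -> float:
    """Turn the effort A (here: air distance) into a rating B = A^(-alpha)."""
    if distance_km <= 0:
        raise ValueError("distance must be positive")
    return distance_km ** (-alpha)


alpha = 0.4
near = rating(300, alpha)   # city 300 km away
far = rating(600, alpha)    # city 600 km away, same population

# How many times more attractive the closer city is (equals 2 ** alpha).
ratio = near / far
print(round(ratio, 2))  # 1.32
```

Raising alpha to 1.0 would make the closer city exactly twice as attractive, which gives a feeling for how mild the 0.3 to 0.4 range is.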

Second step: You could argue that it's up to every world (or country) builder to determine how popular a given mode of transportation is in a country, and in a way I'm doing exactly that: next, I'm feeding the system with the number of (return) trips per capita per year on high-speed trains in Kojo[2] (in this case: 2 trips, which means 4 one-way journeys, roughly twice as many as in Germany). To determine how many trips each city spawns and attracts[3], we then calculate the city's share of the total population and multiply it by the total number of trips. In my case, I added an extra step in between, where I adjust the population once for trips being spawned in a city, and once for trips being attracted to that city. There are two reasons for that. Firstly, some cities (most notably centers of business or tourism) attract more trips per inhabitant than more "boring" cities, even after adjusting for population. Secondly, we must not forget about the main diagonal in the final traffic matrix, i.e. trips "from A to A"; or rather, we do forget about them (setting them to zero), and just say "well, people in Pyingshum already have a big trade fair/concert venue/international airport, so they will need to leave the city (= cell) less frequently than someone from Yamatsuma who lives in the middle of nowhere", and just secretly lower Pyingshum's population for the spawned-trips calculation.
Cell overview
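As a rough illustration of this second step, here is a small Python sketch. All populations and adjustment factors are made up for the example, and normalising by the sum of the adjusted populations (so that spawned and attracted trips add up to the same total) is my assumption about how the shares are formed, not necessarily what the spreadsheet does.

```python
# name: (population, spawn_factor, attract_factor) -- all values hypothetical
cells = {
    "Pyingshum": (10_000_000, 0.8, 1.3),
    "Kippa": (2_000_000, 1.0, 1.0),
    "Yamatsuma": (500_000, 1.2, 0.7),
}
RETURN_TRIPS_PER_CAPITA_PER_YEAR = 2  # value given in the text


def daily_trips(cells, return_trips_per_capita):
    """Return (quelltrips, zieltrips): one-way journeys per day per cell."""
    total_pop = sum(pop for pop, _, _ in cells.values())
    # One return trip is two one-way journeys; spread evenly over the year.
    total_daily = total_pop * return_trips_per_capita * 2 / 365

    spawn = {name: pop * s for name, (pop, s, _) in cells.items()}
    attract = {name: pop * a for name, (pop, _, a) in cells.items()}
    spawn_sum = sum(spawn.values())
    attract_sum = sum(attract.values())

    quell = {name: total_daily * w / spawn_sum for name, w in spawn.items()}
    ziel = {name: total_daily * w / attract_sum for name, w in attract.items()}
    return quell, ziel


quelltrips, zieltrips = daily_trips(cells, RETURN_TRIPS_PER_CAPITA_PER_YEAR)
for name in cells:
    print(name, round(quelltrips[name]), round(zieltrips[name]))
```

With these factors Pyingshum spawns fewer trips than its raw population share would suggest, but attracts more, which is the effect described above.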

We end up with a list of spawned and attracted trips per day per cell ("Quelltrips" and "Zieltrips"). Our next and (for now) last step is to spread all trips spawned from, let's say, Kippa over all other possible destinations, and to do so in proportion to every destination's share of attracted trips as well as the share of that relation's rating in all ratings. I won't attempt to explain the maths here, but it's not as hard as it sounds and I would encourage you to have a look at it yourself. You can find the spreadsheet if you follow this link.
We end up with the desired matrix that shows us exactly how many people will travel per day from cell i to cell j. As expected it's symmetrical, meaning that when 12565 people a day travel from Pyingshum to Kippa, 12565 will also travel in the other direction (on average).
End result
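For readers who prefer code over spreadsheet cells, the distribution step can be sketched like this in Python. This is a standard production-constrained gravity distribution, not a copy of the spreadsheet's formulas; the trip totals, distances and the final averaging step (`symmetrise`) are assumptions made for the illustration.

```python
ALPHA = 0.4

# Hypothetical daily totals per cell and hypothetical air distances in km.
quelltrips = {"Pyingshum": 60000.0, "Kippa": 30000.0, "Yamatsuma": 10000.0}
zieltrips = {"Pyingshum": 70000.0, "Kippa": 25000.0, "Yamatsuma": 5000.0}
distance_km = {
    ("Pyingshum", "Kippa"): 300,
    ("Pyingshum", "Yamatsuma"): 600,
    ("Kippa", "Yamatsuma"): 450,
}


def rating(a, b):
    """B = A^(-alpha) for the relation a -> b (distances are symmetric)."""
    d = distance_km.get((a, b)) or distance_km[(b, a)]
    return d ** (-ALPHA)


def distribute(quell, ziel):
    """Spread each cell's spawned trips over all other cells."""
    trips = {}
    for i in quell:
        # Weight of each destination: attracted trips times rating.
        weights = {j: ziel[j] * rating(i, j) for j in quell if j != i}
        total = sum(weights.values())
        trips[i] = {
            j: 0.0 if j == i else quell[i] * weights[j] / total for j in quell
        }
    return trips


def symmetrise(trips):
    """Average both directions so that i -> j equals j -> i."""
    return {i: {j: (trips[i][j] + trips[j][i]) / 2 for j in trips} for i in trips}


trips = distribute(quelltrips, zieltrips)
sym = symmetrise(trips)
print(round(sym["Pyingshum"]["Kippa"]), round(sym["Kippa"]["Pyingshum"]))
```

Note that the raw result of `distribute` is in general not symmetrical; the averaging is just one simple way to get equal flows in both directions.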

Now comes the part where I need help: I've already sketched out a new map showing IC and CC train routes in Kojo, and I would like to transfer the passenger numbers onto some kind of network graph, so I can get an idea of how many trains I need on which sections and on which train lines. I suppose I could do this manually, by making a long list of every individual train line and its route sections and then manually assigning every start->destination relation to the segments the passengers would travel on. Do you know of an easier way, or perhaps a digital tool to manage such data? I'm sure there must be one, and I'd be grateful for any advice you might have on that, or comments on the work I presented here.
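To make the manual procedure concrete, here is a minimal Python sketch of one way it could be automated: route every start->destination relation along the shortest path through the network and add its passengers to each segment on the way. The network topology and segment lengths are invented for the example, and shortest-distance routing is an assumption (real passengers might prefer faster or more direct services).

```python
import heapq
from collections import defaultdict

# Hypothetical network: graph[u] = {v: segment length in km}
graph = {
    "Finkyase": {"Pyingshum": 250},
    "Pyingshum": {"Finkyase": 250, "Kippa": 300},
    "Kippa": {"Pyingshum": 300},
}

# Passengers per day for each (origin, destination) relation.
od_trips = {
    ("Finkyase", "Kippa"): 2000,      # made-up number
    ("Pyingshum", "Kippa"): 12565,
    ("Kippa", "Pyingshum"): 12565,
}


def shortest_path(graph, start, goal):
    """Dijkstra's algorithm; returns the list of stations from start to goal."""
    queue = [(0.0, start, [start])]
    seen = set()
    while queue:
        dist, node, path = heapq.heappop(queue)
        if node == goal:
            return path
        if node in seen:
            continue
        seen.add(node)
        for nxt, length in graph[node].items():
            if nxt not in seen:
                heapq.heappush(queue, (dist + length, nxt, path + [nxt]))
    raise ValueError(f"no route from {start} to {goal}")


def segment_loads(graph, od_trips):
    """Sum up passengers per directed segment (u, v)."""
    loads = defaultdict(float)
    for (origin, dest), pax in od_trips.items():
        path = shortest_path(graph, origin, dest)
        for u, v in zip(path, path[1:]):
            loads[(u, v)] += pax
    return dict(loads)


loads = segment_loads(graph, od_trips)
print(loads[("Pyingshum", "Kippa")])  # 14565.0
```

Because the loads are recomputed from the matrix every time, recalibrating the model only means rerunning the script instead of reassigning everything by hand.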

TL;DR: Enjoy.

  1. Note we are simplifying by using air-distance. Of course realistically, travel time or generalised travel costs would be better
  2. This is an important difference to the way it's done in the real world; there, travel demand is estimated first, and then you try to predict how the population spreads these journeys over the different transport modes (car, train, plane, bike...). Therefore, one important shortcoming of my model is that it will overestimate train traffic on routes that are more comfortable to travel by other modes; let's say Shangme to Oreppyo, where no one in their right mind would take a train via Pyingshum instead of just taking a bus. Since the rail network in Kojo is supposed to be pretty extensive, however, and because the largest part of the population lives in urban centers well served by high-speed trains, this error should be negligible. I might also adjust that in the second part of this project, but more about that later.
  3. This step doesn't take place in the standard model; there it is irrelevant where a trip is "spawned", instead the standard model only differentiates between originating and terminating traffic.


Please leave your thoughts below this line

This is so cool! I am so envious of both your patience and how skilled at work with data you are. Though the calculations might be slightly different from what would actually happen in a real world situation, I still think it's interesting and could be helpful with planning a country's infrastructure network. --Eklas (talk) 21:33, 16 January 2019 (CET)

This is a very good mathematical model of transportation. In fact, mostly the same method is used to forecast the number of passengers on proposed railroads and of cars on proposed roads (though there, other factors such as competition between the various means of transport also come in), as well as in general studies of how people move within an area; in the second case, however, the theoretical number is corrected by complex surveys, which obviously are impossible to do in OGF. --Rüstem Paşa Discussion 21:52, 16 January 2019 (CET)

Thanks for including Ataraxia in this great project! Interesting to see that Ataraxia City is the 3rd busiest city in your "national" network. Just a note on population - 7 million is roughly the urbanized area tagged 'residential' but that doesn't include provinces like Delmar (Marbella) or Western Lainiere. The population of "region 6" might be 8-10 million depending on how you count (including Vendôme and Domaine Royal to the South). But I see that you are also accounting for the change in travel demand due to language/currency/economic system at our border that limits somewhat the size effect.

I understand the gravity model concept, but the results it gives are still curious. Geryong is within commuting distance from central Fenelec, but the model gives more daily travel all the way to Ataraxia City. The model shows almost 3x more travel from Ataraxia City to Pyingshum by train than to Finkyase, but Pyingshum-Ataraxia City should be a busy air route and Finkyase-Ataraxia City is at the sweet spot for rail.

Is the idea to refine service patterns for high-speed rail or planning for new services? Dono87 (talk) 00:26, 17 January 2019 (CET)

Hey Dono! Not including my big neighbour was perhaps the biggest flaw in the previous version, so I made sure to correct that. The travel-demand adjustment really is the Achilles' heel of the whole thing, because it's up to our imagination. I might set it even lower, but the traffic volume should be pretty considerable nonetheless.
As for the Sappaer-iki -> Feneix/Ataraxie-Ville discrepancy, my current model calculates 142 and 426 trips respectively per day per direction. That's three times the amount of traffic going to Ataraxie-Ville, while AV has 4.5 times the population of Feneix in my current model. I guess this could mean I need to adjust the parameter to make passengers more sensitive to distance; currently the distance only has a small impact on the travel patterns. But also keep in mind that I justify the low estimates of the model at short distances by arguing that daily commuters will use more local trains, and these obviously don't show up here. So the number of people commuting between Fenelec and Geryong is surely higher than 142 per day ;) Leowezy (talk) 19:33, 18 January 2019 (CET)

Wonderful. And after that, there is a next step: the cost-benefit factor ("Kosten-Nutzen-Faktor"). That means that on the one side there is the demand for train rides from the calculation you made, and on the other the cost of the rails you need for this traffic. After that step you get a priority list of the lines you have to build. The cost-benefit analysis then tells you the economic value of the lines to be built. After all, not all lines are of equal value, and some less frequently demanded travel relations fall through the cracks. Those lines are not built. --Histor (talk) 01:53, 17 January 2019 (CET)

Hello Histor, thanks for your feedback! The possibility of an economic assessment is certainly interesting, and as a rough estimate the data is of course meant to help with exactly that: finding out where connections and lines are worthwhile. I wouldn't be able to calculate it all the way through, though, because I'm simply missing too much input data for that (construction costs and terrain on the one hand; economic benefit from modal shift, accessibility effects on the other, etc.), and it would also be extremely laborious. But thank God, in real life it is all too often precisely the big infrastructure projects that are backed by a rather questionable cost-benefit calculation ;) Leowezy (talk) 19:33, 18 January 2019 (CET)

Everyone, thank you for your lovely words of encouragement! I'm still a little helpless regarding the next steps though, namely how to easily transfer the generated numbers onto a network graph. Because as you can see from the comments, there's always lots to recalibrate and improve, and I'd like to avoid having to process all that data again and again. Leowezy (talk) 19:33, 18 January 2019 (CET)

I'm nowhere near interested enough in transportation infrastructure to have an informed opinion on this, but as someone who spends a LOT of time making spreadsheets this is really awesome to see. I wish more people shared their worldbuilding spreadsheets on here! We could have a whole subforum...hehehe... -- LW (talk) 19:49, 18 January 2019 (EST)

I may never do quite as you have done, but you have my utmost respect for such meticulous work. I know whom to contact when I'm ready to put more work in on public transport infrastructure in my territories.--Luciano (talk) 03:07, 19 January 2019 (CET)

Here's where I start to complicate things for you: I need to know what exactly you're calculating - total number of trains running, I guess? And more importantly, over what average timespan: yearly or daily, and mean, median or modal average? You realise that there are going to be times of 'surge' in demand which all transportation companies build into timetables (although high-speed rail is one of those with less dramatic changes) - public holidays and festivals when there is either an increased service (bank holiday weekend) or no service (e.g. Christmas Day), and then commuting times of day vs the night. Also, depending on the railway company in Kojo, there may be discount incentives to travel at less busy times to even out the passenger levels. Maybe your formula can accommodate some 'anomaly' data to allow for this? Sarepava (talk) 11:52, 19 January 2019 (CET)

Hey Sarepava, thank you for your interest! The spreadsheet only calculates the number of people travelling between any two given cells. The numbers are per day, as a year-round average. When calculating the number of trains running, I first think about which THC train set will be used, and for a normal weekday just divide the train's capacity by two; Deutsche Bahn operates its long-distance services at roughly 50% capacity, which obviously means that on Fridays or at the start of holidays trains are more crowded than on Sundays, when less than 50% of seats are taken. Of course I also try to keep in mind that busy IC routes will have a higher degree of utilisation than most CC routes, but I haven't gotten to that step so far outside of some approximations. Leowezy (talk) 16:19, 19 January 2019 (CET)
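As a quick illustration of that rule of thumb, here is a tiny Python sketch. The 400 seats per train set are a made-up figure (I'm not quoting any actual THC specification), and the 12565 passengers are the Pyingshum to Kippa flow from the diary.

```python
import math


def trains_per_day(daily_passengers, seats_per_train, load_factor=0.5):
    """Trains needed per day and direction at an average seat utilisation."""
    return math.ceil(daily_passengers / (seats_per_train * load_factor))


# 12565 passengers, hypothetical 400-seat train set, 50% average load:
needed = trains_per_day(12565, 400)
print(needed)  # 63
```

A higher `load_factor` for busy IC routes (say 0.6) would lower the number of trains accordingly.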

What you’re doing seems similar to my attempt to determine a somewhat reasonable aircraft fleet size using a simulated airline network. I started by estimating passenger traffic between two cities based primarily on distance, and city and country populations, and then wrote scripts to move aircraft between city pairs. I added aircraft to the simulation until the demand on the routes was filled. Departure times were adjusted to minimize aircraft idle time. Eventually I had a fleet number and aircraft mix that worked. Rail seems like it would be more complicated since the network has more than just back-and-forth travel, and passengers are getting on and off along the way. But you might be able to do something similar. For my simulation I used FileMaker Pro to manipulate the data. I’m sure there are better tools, and most likely something that’s specifically made for scheduling. If you’re interested in more information on what I did, let me know and I’ll try to outline it in more detail, which might suggest a solution to you. --Paxtar (talk)

Hey Paxtar, that sounds like it could be really helpful. Would you mind elaborating? Leowezy (talk) 20:35, 20 January 2019 (CET)

Here’s the process I used to create the airline schedule. In a relational database I set up the following tables:


Location (lat/lon)
Importance (0.0 - 1.0, based on culture, interest, friendliness, etc., to adjust passenger demand)


Cities on Route (something like: NYC-MUC, MUC-LAX, or LAX-NYC) Each direction counted as a different route.
Calculated Demand (passenger demand per week, based on city data )
Calculated Supply (seats available per week based on schedule and aircraft type)


Flight Number
Route Taken (something like NYC-MUC or LAX-CHI-MUC)
Aircraft Type used
Scheduled Departure Time
Actual Departure Time
Estimated Time (calculated based mostly on distance and speed of aircraft type)
Actual Arrival Time


Aircraft Model


Tail number (for tracking individual aircraft)
Type (pulls model speed and capacity from Type table)
RouteType (domestic or international)
Location (current city, or next city if enroute)
Home City (initial position of aircraft at start of simulation)
Minimum turn-around time (how long the aircraft cannot be used while it is being unloaded, refueled and reloaded)
Status (available at current city, OR enroute to next destination, OR turn-around)
Time (time at which aircraft arrives at next city OR time at which turn-around is complete)

With that information the script goes through the schedule minute by minute, starting at 00:00 on Day 1. All schedule times are GMT, with local times displayed in the published schedule.

1. Aircraft with Time matching the current minute are updated. Others are ignored.

If the status is 'Enroute' then:
  Status is updated to 'Turn-around'
  Time is updated to current time + minimum turn-around time
  Actual arrival time is logged
  Usage is logged, flight minutes + turn-around time
If the status is 'Turn-around' then:
  Status is updated to 'Available'
  Time is blanked

2. All scheduled departures matching the current minute are added to a pending queue.

3. Pending departures are processed in the order in which they entered the pending queue.

If an eligible aircraft is available for a pending departure then:
  Aircraft Status is updated to 'Enroute'
  Location is updated to the destination city
  Time is changed to its arrival time, based on speed and distance
  Actual departure time is logged
If no aircraft is available, the flight remains pending

After the day/week/month has been processed, the log of scheduled times can be compared to actual departures and used to adjust capacity supply at different cities, or alter departure times, to make the schedule more efficient. The flight time log helps determine the efficiency of the fleet. Unused or underused aircraft are moved, or removed from the fleet.
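The original was scripted in FileMaker Pro, so the following is only a rough Python re-imagining of the three steps above, not the actual script. The class and field names, the flight durations and the simplification that "eligible" just means "available at the departure city" are all assumptions made for the sketch; turn-around times and durations are assumed to be at least one minute.

```python
from collections import deque
from dataclasses import dataclass
from typing import Optional


@dataclass
class Aircraft:
    tail: str
    location: str                 # current city, or next city if enroute
    turnaround_min: int
    status: str = "Available"     # "Available", "Enroute" or "Turn-around"
    time: Optional[int] = None    # minute at which the current status ends
    used_min: int = 0             # usage log: flight + turn-around minutes


@dataclass
class Flight:
    number: str
    origin: str
    dest: str
    sched_dep: int                # minutes since 00:00 on day 1 (GMT)
    duration: int                 # estimated flight time in minutes
    actual_dep: Optional[int] = None
    actual_arr: Optional[int] = None


def simulate(fleet, schedule, minutes):
    pending = deque()
    flying = {}                   # tail number -> flight currently operated
    for now in range(minutes):
        # 1. Update aircraft whose Time matches the current minute.
        for ac in fleet:
            if ac.time != now:
                continue
            if ac.status == "Enroute":
                flight = flying.pop(ac.tail)
                flight.actual_arr = now
                ac.used_min += flight.duration + ac.turnaround_min
                ac.status, ac.time = "Turn-around", now + ac.turnaround_min
            elif ac.status == "Turn-around":
                ac.status, ac.time = "Available", None
        # 2. Queue all departures scheduled for this minute.
        pending.extend(f for f in schedule if f.sched_dep == now)
        # 3. Process pending departures in the order they were queued.
        still_pending = deque()
        for flight in pending:
            ac = next((a for a in fleet
                       if a.status == "Available" and a.location == flight.origin),
                      None)
            if ac is None:
                still_pending.append(flight)   # stays pending
                continue
            ac.status, ac.location = "Enroute", flight.dest
            ac.time = now + flight.duration
            flight.actual_dep = now
            flying[ac.tail] = flight
        pending = still_pending


# One aircraft, two flights; durations are illustrative only.
fleet = [Aircraft("N001", "NYC", turnaround_min=60)]
schedule = [
    Flight("100", "NYC", "MUC", sched_dep=0, duration=480),
    Flight("101", "MUC", "NYC", sched_dep=500, duration=510),
]
simulate(fleet, schedule, minutes=1440)
for f in schedule:
    print(f.number, f.sched_dep, f.actual_dep, f.actual_arr)
```

In this run the return flight leaves 40 minutes late because the only aircraft is still in turn-around at its scheduled time, which is exactly the kind of gap the log comparison is meant to reveal.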

I used the data from the schedule to create the animated air traffic gif from several years ago.

I've tried to simplify the explanation as much as possible. I'm sure a similar scheduler could be worked out for a rail system, with modifications for multiple cities on routes, using pre-calculated city pair distances rather than direct distances. I'd be willing to share the FileMaker Pro database I used, but it's not user friendly since I made it for my own use, and didn't create any documentation for how it works. The scripting language it uses isn't especially difficult, but does take a little time to learn.

--Paxtar (talk) 05:56, 26 January 2019 (CET)

But the animated version could surely be used for trains as well as for airplanes, I think. For example, for the AVE high-speed trains in Latina (each line in a different colour) or for the commuter lines of Stanton / FSA. I am interested. --Histor (talk) 10:30, 26 January 2019 (CET)
The animated gif was created by tracking the location of every aircraft at time points in the simulation, usually every 10 minutes. Enroute aircraft locations were determined by calculating their position along their great circle route to the next city.
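For illustration, a position along a great circle route can be computed with the standard intermediate-point formula. This Python sketch is not the original FileMaker script; the function name is made up, and it assumes a spherical Earth and does not handle exactly antipodal city pairs.

```python
import math


def great_circle_point(lat1, lon1, lat2, lon2, fraction):
    """Position (lat, lon in degrees) at `fraction` (0..1) of the route."""
    p1, l1, p2, l2 = map(math.radians, (lat1, lon1, lat2, lon2))
    # Angular distance between the two cities (haversine formula).
    h = (math.sin((p2 - p1) / 2) ** 2
         + math.cos(p1) * math.cos(p2) * math.sin((l2 - l1) / 2) ** 2)
    delta = 2 * math.asin(math.sqrt(min(1.0, h)))
    if delta == 0:
        return lat1, lon1  # both cities are at the same place
    a = math.sin((1 - fraction) * delta) / math.sin(delta)
    b = math.sin(fraction * delta) / math.sin(delta)
    # Weighted sum of the two position vectors on the unit sphere.
    x = a * math.cos(p1) * math.cos(l1) + b * math.cos(p2) * math.cos(l2)
    y = a * math.cos(p1) * math.sin(l1) + b * math.cos(p2) * math.sin(l2)
    z = a * math.sin(p1) + b * math.sin(p2)
    lat = math.atan2(z, math.hypot(x, y))
    lon = math.atan2(y, x)
    return math.degrees(lat), math.degrees(lon)


# Halfway between two points on the equator, 90 degrees of longitude apart:
mid = great_circle_point(0.0, 0.0, 0.0, 90.0, 0.5)
print(mid)
```

The `fraction` would simply be the elapsed flight time divided by the estimated flight time at each 10-minute time point.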
A blank array was created at the size of the map image, and the XY location (calculated from lat/lon) of each aircraft was marked using a hex color code different for each type of aircraft. Surrounding XY points were also marked to make the positions more visible.
The array with the data was saved as a PBM File, with one file for each time point. All the PBM files were opened in Photoshop and points with value 0 (black) were changed to transparent, and the OGF map was placed in the background. The resulting files were loaded as an image sequence and saved as an animated GIF.
The same process could be done with a train schedule, but since trains aren’t moving in straight lines, positions couldn’t be calculated without pre-loaded routes in the database. Node data for each route could be pulled from an OSM file. A way between two cities would provide a list of nodes, which would result in a list of lat/lon locations. Using time and speed, the train’s location along the ‘way' could be calculated and marked on an image.
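A minimal Python sketch of that idea, assuming the node list of a way has already been extracted as (lat, lon) pairs and that the train runs at a constant speed (both simplifications; the function names are hypothetical). Within one segment the position is interpolated linearly, which should be fine for the short distances between neighbouring OSM nodes.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(a, b):
    """Great-circle distance between two (lat, lon) points in km."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, h)))


def position_on_way(nodes, speed_kmh, minutes_since_departure):
    """Return the train's (lat, lon) after travelling along `nodes`."""
    travelled = speed_kmh * minutes_since_departure / 60
    for start, end in zip(nodes, nodes[1:]):
        seg = haversine_km(start, end)
        if seg > 0 and travelled <= seg:
            f = travelled / seg
            return (start[0] + f * (end[0] - start[0]),
                    start[1] + f * (end[1] - start[1]))
        travelled -= seg
    return nodes[-1]  # the train has reached the end of the way


# Three made-up nodes along the equator, one degree apart.
nodes = [(0.0, 0.0), (0.0, 1.0), (0.0, 2.0)]
print(position_on_way(nodes, speed_kmh=200, minutes_since_departure=20))
```

Calling this for every train at every time point would give the list of positions needed to mark the image, just as with the aircraft.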
The most complicated part of adapting the flight scheduler to rail is tracking and managing routes with intermediate stops. LAX-NYC is simple. LAX-LAS-DEN-MCI-CHI-CLE-PIT-NYC is a little more complicated. It is similar to using a relation made up of separate ways, compared to just using one way.
It’s not a project I have time to take up right now, but I am willing to help. —Paxtar (talk) 18:17, 26 January 2019 (CET)
Some time ago we had a lengthy diary thread exploring the possibility of creating an animated Leaflet map, containing real-time tracking of public transport lines. I'd really like to see something like or for OGF. I guess your database could be the starting point for that. Thilo (talk) 21:13, 26 January 2019 (CET)
Thank you Paxtar for your insights, and thanks Thilo for the reminder about the past discussion (btw, do you know why the maps for Pyingshum's metro and commuter rail network aren't working anymore?).
Syntax error in the "multimaps" definition, i.e. the trailing comma after the last element ("16") of "overlaydef" (strict JSON required here). Thilo (talk) 20:37, 27 January 2019 (CET)
Oh my God thank you so much! I would have never spotted that.Leowezy (talk) 20:40, 27 January 2019 (CET)
I personally feel that, while a project like yours seems very interesting and useful for getting to know some basic coding etc., it would be a little overkill for my ambitions. Your explanation made me understand that perhaps such an undertaking is even more complicated than I thought; not meaning to discourage others from trying to replicate it. I think for the case of high-frequency high-speed rail lines in Kojo, I will just manually add up the most important traffic flows to get a realistic estimate for train capacity and frequency. But I'm definitely in awe of the amount of work that went into your scheduling and the visualisation of the resulting air traffic. Leowezy (talk) 20:01, 27 January 2019 (CET)

I like the idea of having real-time train locations appearing on the map, and updating with every reload of the page. Once the actual schedule is worked out, scripting something to generate a minute by minute list of train locations wouldn't be complicated. I’m guessing that the table would need to contain: operator, train number, day of week, time, latitude, and longitude.
How common is it for one train number to be in use more than once on a day? For example, Amtrak on the U.S. west coast usually has two train 14s running on the same day: one departing from Los Angeles in the morning, and the other from the previous day not yet having completed its 35-hour trip to Seattle due to its blazing average speed of 63 km/h. If the F.S.A. is similar to the U.S. in that it has a slow rail system, that would need to be taken into account.
@Thilo: Can data be pulled from a static external source to mark a point on the map?
@Histor: The existing process, with updates, could also be used to generate an animated gif for a rail system. That would probably be easier than a live version. How many trains and routes?
@Leowezy: It was overkill for me, especially since my original motivation was just to determine a size for an airline fleet. :)
For Latina, a lot of trains (10 lines, with a train every hour on each). I will think about it. --Histor (talk) 23:22, 2 February 2019 (CET)