Third Party Developer Blog

ESI, Concurrent Programming, and Pagination

CCP Zoetrope | 2017-09-08 14:42

This blog post is part of a series of blogs examining best practices for ESI development. Each blog will be published on the 8th of each month during the journey towards XML API and CREST’s termination date. The legacy APIs will be terminated on May 8th, 2018, or earlier if metrics signal a trivial level of usage.

This blog will cover the concept of concurrent programming and how to use this model of programming to improve your software's performance while pulling data from ESI. If you are already familiar with concurrent programming, here is the ESI pagination TL;DR: use the X-Pages header returned from paginated ESI endpoints to determine how many calls are needed to get all data from a given endpoint.

Prerequisites

This blog includes examples of Linux commands as well as Python code, so readers who want to follow along exactly will need a Linux-based system and Python 3. It is also assumed that both pip and virtualenv are installed.

Have you ever used ESI's /markets/{region_id}/orders/ endpoint? Did you know that this endpoint is paginated and allows you to retrieve more than just the first 10,000 orders? Did you know that there are more ESI endpoints than this that support pagination? If the answer to one or more of these questions was no, then this blog is for you! If you answered yes but you have never been quite sure how you are supposed to utilize pagination in ESI, then this blog is also for you! Using the /markets/{region_id}/orders/ endpoint this blog will walk you through getting all orders from a specific region in a concurrent manner. This method can then be used for other paginated ESI endpoints or applied generally to concurrently fetch information from multiple endpoints.

How ESI Handles Pagination

ESI handles pagination the same way for most^[1] endpoints. Here's a high-level example of getting the paginated data from the market orders endpoint:

Send a request to https://esi.tech.ccp.is/v1/markets/10000042/orders/ to get the first 10,000 orders in the Metropolis region.
ESI returns a response that contains 10,000 orders along with the header X-Pages which has a value of 10. This is telling you that all active orders in this region are split into 10 pages.
Knowing that there are ten pages, you can now send nine more requests (because you already have page 1) starting at page 2 and incrementing up to 10. The way you do this is by adding a page query parameter to the URL.
After making these ten requests you should have all orders in Metropolis.
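The steps above boil down to turning the X-Pages value into a list of follow-up URLs. Here is a minimal sketch of that step; the helper name remaining_page_urls is made up for this illustration and is not part of ESI or any library:

```python
from urllib.parse import urlencode

ORDERS_URL = "https://esi.tech.ccp.is/v1/markets/10000042/orders/"


def remaining_page_urls(base_url, x_pages):
    """Build the URLs for pages 2..x_pages (page 1 is already in hand).

    remaining_page_urls is a hypothetical helper name used for illustration.
    """
    total_pages = int(x_pages)  # HTTP header values arrive as strings
    return [
        "{}?{}".format(base_url, urlencode({"page": page}))
        for page in range(2, total_pages + 1)
    ]


urls = remaining_page_urls(ORDERS_URL, "10")
print(len(urls))  # 9
print(urls[0])    # https://esi.tech.ccp.is/v1/markets/10000042/orders/?page=2
print(urls[-1])   # https://esi.tech.ccp.is/v1/markets/10000042/orders/?page=10
```

If X-Pages is 1, the list is empty and no further requests are needed.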

Remember that the number returned by X-Pages will vary depending on the time of day and region, and is simply a way of letting you know how much more data there is. The maximum number of items per page can differ between paginated ESI endpoints; it is indicated by the maxItems property in a given endpoint's swagger spec.

How to tell if an ESI endpoint is paginated

To find out whether a particular ESI endpoint is paginated, check its swagger spec: paginated endpoints list "page" as a parameter.
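If you have the swagger spec loaded as a Python dictionary, this check can be automated. The sketch below assumes the spec follows the usual Swagger 2.0 layout (paths, then HTTP method, then parameters) and that the page parameter may appear either inline or as a "$ref" pointer. The is_paginated helper and the trimmed spec fragment are made up for illustration:

```python
def is_paginated(spec, path, method="get"):
    """Return True if the operation lists a parameter named "page".

    is_paginated is a hypothetical helper; it assumes a Swagger 2.0 layout.
    """
    operation = spec["paths"][path][method]
    for param in operation.get("parameters", []):
        # Parameters are either defined inline or shared via a "$ref" pointer.
        if param.get("name") == "page":
            return True
        if param.get("$ref", "").endswith("/page"):
            return True
    return False


# Hypothetical, heavily trimmed spec fragment used only for illustration.
toy_spec = {
    "paths": {
        "/markets/{region_id}/orders/": {
            "get": {
                "parameters": [
                    {"name": "region_id", "in": "path"},
                    {"$ref": "#/parameters/page"},
                ]
            }
        },
        "/example/unpaged/": {"get": {"parameters": []}},
    }
}

print(is_paginated(toy_spec, "/markets/{region_id}/orders/"))  # True
print(is_paginated(toy_spec, "/example/unpaged/"))             # False
```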

Setting up your Environment

Before moving on you must set up an appropriate environment. In a directory of your choice, make a new virtual environment:

$ virtualenv .venv --python=python3

Activate this virtual environment:

$ source .venv/bin/activate

Next, run the following pip command to install the libraries necessary to run the examples in this blog (grequests pulls in requests and gevent as dependencies):

(.venv)$ pip install grequests

Getting Market Orders Synchronously using Python

We're going to first get all market orders in Metropolis (region ID 10000042) in a sequential manner. The following is pseudocode to show what we will be doing:

call https://esi.tech.ccp.is/v1/markets/10000042/orders/
extract the X-Pages header from the return
for page_number in 2..pages
    call https://esi.tech.ccp.is/v1/markets/10000042/orders/?page=page_number
combine the data from all calls

If the value returned by X-Pages was 3, then we could visualize this program like so:

where each colored block represents a task over time.

Make a new file called get_sequential.py and paste this code inside:
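A minimal sketch of what such a file could contain, following the pseudocode above. It assumes the requests library (installed as a dependency of grequests); the function names get_all_orders and main are choices made for this sketch, and the HTTP function is passed in as an argument so the paging logic can be tried without touching the network:

```python
import time

ORDERS_URL = "https://esi.tech.ccp.is/v1/markets/10000042/orders/"


def get_all_orders(get, url=ORDERS_URL):
    """Fetch every page one after another; `get` is e.g. requests.get."""
    first = get(url)
    first.raise_for_status()
    orders = first.json()  # each page is a JSON list of orders
    pages = int(first.headers.get("X-Pages", 1))

    start = time.time()
    for page in range(2, pages + 1):  # page 1 is already in hand
        response = get(url, params={"page": page})
        response.raise_for_status()
        orders.extend(response.json())
    elapsed = time.time() - start
    return orders, max(pages - 1, 0), elapsed


def main():
    import requests  # installed as a dependency of grequests

    orders, request_count, elapsed = get_all_orders(requests.get)
    print("Elapsed time for {} requests was: {}".format(request_count, elapsed))
    print("Got {:,} orders.".format(len(orders)))
```

To make the file runnable as a script, add a call to main() as its last line.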

Make sure you're still inside the virtual environment you created earlier and run it with:

(.venv)$ python get_sequential.py

This code will display how long it took to complete all the requests needed to get all of the orders and how many orders it retrieved. Here is the output from running it on my machine:

Elapsed time for 9 requests was: 4.024033069610596
Got 91,064 orders.

As we can see it took about 4 seconds to get 9 pages sequentially (remember that it skips calling the first page again). Why is that? It's because, for every request sent, the requests library waits for the response from the endpoint. Basically, a lot of the time spent is because of network latency. This is easier to visualize by adding wait time to the previous visualization (the gray areas signify wait time):

In this case, "wait time" is considered to be the time that our code is just waiting for a response from ESI. We could instead take advantage of this wait time and let our code perform more tasks that do not depend on the data returned from ESI. If we were to do this, the code would be considered concurrent.

Getting Market Orders Concurrently using Python

How would retrieving paginated market orders work in a concurrent manner? Which tasks could be done without needing to know about the other? Because the wait time is mostly network related, we could instead start each ESI call one after the other and consume their returns at a later time. Here's pseudocode to show how we will approach this concurrently (comments are preceded by the # symbol):

call https://esi.tech.ccp.is/v1/markets/10000042/orders/
wait for the response from ESI # because we need the value of X-Pages
extract the X-Pages header from the return
for page_number in 2..pages
    call https://esi.tech.ccp.is/v1/markets/10000042/orders/?page=page_number
    defer waiting for the response
consume all responses returned from ESI
combine all the orders

Here is a modified version of the previous visualization to represent this:

Remember that this is a simplified visualization and may not represent exactly how these given tasks are scheduled by gevent. Read more about gevent and greenlets here.

How is this done using Python? For this blog we will be using a library called grequests that in turn depends on the libraries gevent and requests. gevent is a networking library for Python that allows network calls to be started, suspended, and resumed independently in an event loop. grequests combines the requests library with gevent, essentially allowing HTTP requests to be started and consumed at different times. Put the following in a file called get_concurrent.py:
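A minimal sketch of what such a file could contain, following the pseudocode above. The function name concurrent_requests comes from this blog's own description; combine_orders and main are names chosen for this sketch. grequests is imported inside the functions only so that the helper that combines responses can be tried without the library installed:

```python
import time

ORDERS_URL = "https://esi.tech.ccp.is/v1/markets/10000042/orders/"


def concurrent_requests(pages, url=ORDERS_URL):
    """Start requests for pages 2..pages together and collect the responses."""
    import grequests

    reqs = [grequests.get(url, params={"page": page}) for page in range(2, pages + 1)]
    # map() sends everything and returns responses in request order;
    # a request that failed shows up as None.
    return grequests.map(reqs)


def combine_orders(responses):
    """Merge the JSON lists from every response into one list of orders."""
    orders = []
    for response in responses:
        if response is None:
            raise RuntimeError("a page request failed")
        response.raise_for_status()
        orders.extend(response.json())
    return orders


def main():
    import grequests

    # The first call has to finish before anything else: X-Pages is needed.
    first = grequests.map([grequests.get(ORDERS_URL)])[0]
    if first is None:
        raise RuntimeError("the first page request failed")
    first.raise_for_status()
    pages = int(first.headers.get("X-Pages", 1))

    start = time.time()
    responses = concurrent_requests(pages)
    elapsed = time.time() - start

    orders = first.json() + combine_orders(responses)
    print("Elapsed time for {} requests was: {}".format(len(responses), elapsed))
    print("Got {:,} orders.".format(len(orders)))
```

To make the file runnable as a script, add a call to main() as its last line.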

As can be seen, grequests does all the heavy lifting as far as the concurrent logic is concerned. The magic of the concurrent calls happens in the grequests.map() function, which is called inside the concurrent_requests function. grequests.map() simply needs Request objects so that it can call all endpoints and consume the responses from ESI at a later time. If you're interested in understanding further how grequests.map works you can read the code.

Run it with the following command:

(.venv)$ python get_concurrent.py

and you should get output similar to:

Elapsed time for 9 requests was: 0.554999828338623
Got 91,064 orders.

As we can see, the time it took to make the requests dropped dramatically: about 4 seconds sequentially versus about 0.5 seconds concurrently, roughly seven times faster.

Conclusion

Making calls to ESI concurrently can dramatically speed up your software. Hopefully, the demonstration in this blog can serve as a jumping-off point for you to continue on a concurrent path. There are currently only two other endpoints in ESI that handle pagination with the X-Pages header. The first is the /characters/{character_id}/blueprints/ endpoint, which recently gained pagination and is currently only in the /v2 and /dev namespaces. The second endpoint is /markets/{region_id}/types.

There are other paginated endpoints in ESI but they do not support the X-Pages header as of yet. However, Team Tech Co. plans to expand these endpoints' functionality to use the X-Pages header. These endpoints are:

CCP Zoetrope

[1] The following endpoints could be considered paginated but operate differently:

https://esi-test.tech.ccp.is/latest/#!/Killmails/get_characters_character_id_killmails_recent uses max_kill_id and max_count to allow you to limit data returned.
https://esi-test.tech.ccp.is/latest/#!/Wars/get_wars uses max_war_id to allow you to limit data returned.
https://esi.tech.ccp.is/latest/#!/Killmails/get_corporations_corporation_id_killmails_recent uses max_kill_id to allow you to limit data returned.

Appendix

The following paths have been added or updated in ESI since the previous blog:

GET /latest/corporations/{corporation_id}/wallets/ (v1)
GET /latest/corporations/{corporation_id}/membertracking/ (v1)
GET /dev/alliances/{alliance_id}/ (v3)
GET /latest/fw/systems/ (v1)
GET /latest/fw/leaderboards/corporations/ (v1)
GET /latest/fw/leaderboards/characters/ (v1)
GET /latest/fw/leaderboards/ (v1)
GET /latest/characters/{character_id}/planets/{planet_id}/ (v3)
GET /latest/characters/{character_id}/notifications/contacts/ (v1)
GET /latest/universe/systems/{system_id}/ (v3)
GET /dev/characters/{character_id}/blueprints/ (v2)
GET /dev/characters/{character_id}/roles/ (v2)
GET /latest/corporations/{corporation_id}/killmails/recent (v1)
GET /latest/corporations/{corporation_id}/wallets/{division}/journal (v1)
GET /latest/corporations/{corporation_id}/divisions/ (v1)
GET /latest/corporations/{corporation_id}/members/limit (v1)
GET /latest/corporations/{corporation_id}/contacts (v1)

Get a sneak peek at what's coming to ESI by watching this board.
