Asynchronous calls to the OpenAI API

I am currently experimenting a lot with ChatGPT and similar models. The OpenAI API provides methods such as “create”:

import openai

# ChatCompletion is the interface of the openai package before version 1.0
openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Who won the world series in 2020?"},
        {"role": "assistant", "content": "The Los Angeles Dodgers won the World Series in 2020."},
        {"role": "user", "content": "Where was it played?"}
    ]
)

These are blocking calls: execution only continues once the call to the API has finished. When exploring prompt engineering and other LLM parameters, this slows progress significantly. In my experience the OpenAI API sometimes fails to respond at all (bad gateway errors) or responds with other errors (rate limits reached, etc.). Luckily OpenAI provides an asynchronous method, “acreate”, which is non-blocking. This lets me fire off many requests to the API simultaneously and then gather the results at the end.

Asynchronous calls in Python can be a little confusing at first. There are also conflicts when using asynchronous calls in many common Python IDEs, which already use asynchronous IO behind the scenes. I won’t attempt to describe Python’s asyncio in full detail, but here is a rough outline. An asynchronous function can be paused while it waits on a final result, for example the response to a network request. These kinds of functions are called coroutines. While one coroutine is paused, another can start running. We define coroutines with the async keyword, and we mark the points where our code should wait with the await keyword.
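To make the outline above concrete, here is a minimal sketch that does not touch the OpenAI API at all. The function fake_request and its delays are made up for illustration, with asyncio.sleep standing in for a network wait. It uses asyncio.gather and asyncio.run, which come up again further down, and it is meant to be run as a plain Python script rather than inside IPython or Jupyter.

```python
import asyncio
import time


async def fake_request(name, delay):
    # `await` marks the point where this coroutine pauses and lets others run.
    await asyncio.sleep(delay)  # stand-in for waiting on a network response
    return f"{name} finished after {delay}s"


async def main():
    start = time.perf_counter()
    # Run the three coroutines concurrently; results come back in argument order.
    results = await asyncio.gather(
        fake_request("a", 0.3),
        fake_request("b", 0.2),
        fake_request("c", 0.1),
    )
    elapsed = time.perf_counter() - start
    return results, elapsed


if __name__ == "__main__":
    results, elapsed = asyncio.run(main())
    for line in results:
        print(line)
    # Expect roughly the longest delay (0.3s), not the sum of all three (0.6s).
    print(f"elapsed: {elapsed:.2f}s")
```

Because the three waits overlap, the total time should be close to the longest single delay, which is the same effect we are after with the API requests.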

Let’s first create some prompts we want to fire at the OpenAI API:

prompts = [{"role":"user","content":f"Please give me five {animal} names"} for animal in ['dog','cat','horse']]
for prompt in prompts:
    print(prompt)

The method acreate requires us to await the response. We can first wrap each request in a simple function, which calls acreate, awaits the response and then unpacks the result from OpenAI:

async def makeRequest(prompt):
    # messages must be a list of message dicts, so wrap the single prompt
    completion = await openai.ChatCompletion.acreate(model="gpt-3.5-turbo",
                                                     messages=[prompt])
    return completion.choices[0].message.content

Applying this function to each prompt

crs = [makeRequest(prompt) for prompt in prompts]
crs

gives us a list of coroutine objects:

[<coroutine object makeRequest at 0x000001ABF76009E0>,
 <coroutine object makeRequest at 0x000001ABF7567300>,
 <coroutine object makeRequest at 0x000001ABF7567CA0>]
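A toy example, again without the API, shows what such a list of coroutine objects means in practice. The names square and log are made up for this sketch. Calling an async function only creates the coroutine object, and the body does not run until the coroutine is awaited, here through asyncio.gather. As before, this is written to be run as a plain script.

```python
import asyncio


async def square(x, log):
    log.append(x)  # records that the body has actually started running
    return x * x


async def main():
    log = []
    coroutines = [square(x, log) for x in (1, 2, 3)]
    ran_before_await = list(log)  # snapshot taken before anything is awaited
    results = await asyncio.gather(*coroutines)
    return ran_before_await, results, log


if __name__ == "__main__":
    ran_before_await, results, log = asyncio.run(main())
    print("ran before await:", ran_before_await)  # empty list
    print("results:", results)                    # [1, 4, 9]
    print("ran after await:", sorted(log))        # [1, 2, 3]
```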

So far nothing has happened: the body of makeRequest has not run yet, and we just have a list of coroutines ready to be launched. We can use asyncio.gather to run them concurrently and gather the outputs into one big list.

import asyncio

async def gatherWrapper(crs):
    # run all coroutines concurrently; results keep the order of crs
    R = await asyncio.gather(*crs)
    return R

R = asyncio.run(gatherWrapper(crs))

RuntimeError: asyncio.run() cannot be called from a running event loop

Something has gone wrong here. I believe this is because IPython is also running its own event loop behind the scenes. Other environments, such as Jupyter notebooks and GUIs, have their own event loops running in the background. There is a simple workaround: nest_asyncio patches asyncio to allow nested use of asyncio.run. Putting these lines at the beginning will detect whether we need to apply this patch and, if so, apply it.

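try:
    asyncio.get_running_loop()  # raises RuntimeError if no loop is running
    import nest_asyncio
    nest_asyncio.apply()
except RuntimeError:
    pass  # plain script: asyncio.run() works without the patch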

Now let’s look at the responses from OpenAI:

R = asyncio.run(gatherWrapper(crs))
for r in R:
    print(r+"\n")

1. Bailey
2. Max
3. Luna
4. Charlie
5. Cooper

1) Luna
2) Simba
3) Mittens
4) Whiskers
5) Felix

1. Thunderbolt
2. Shadowfax
3. Black Beauty
4. Starlight
5. Silverado
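As mentioned at the start, requests sometimes fail with bad gateway or rate limit errors. By default asyncio.gather raises the first exception it sees, so one bad request can cost you the other results. Below is a minimal sketch of one way around that, using the return_exceptions=True option of asyncio.gather. The function flaky_request is a hypothetical stand-in for makeRequest that fails on purpose, and the error message is simulated rather than a real API response.

```python
import asyncio


async def flaky_request(prompt):
    # Hypothetical stand-in for makeRequest: fails for one prompt on purpose.
    await asyncio.sleep(0.01)
    if "cat" in prompt:
        raise RuntimeError("simulated 502 Bad Gateway")
    return f"answer to: {prompt}"


async def gather_safely(prompts):
    # return_exceptions=True places exceptions in the result list instead of
    # raising the first one, so the successful responses are not lost.
    return await asyncio.gather(
        *(flaky_request(p) for p in prompts), return_exceptions=True
    )


if __name__ == "__main__":
    prompts = ["dog names", "cat names", "horse names"]
    for prompt, result in zip(prompts, asyncio.run(gather_safely(prompts))):
        if isinstance(result, Exception):
            print(f"FAILED {prompt!r}: {result}")
        else:
            print(f"OK     {prompt!r}: {result}")
```

The failed prompts can then be collected and sent again, while the successful responses are kept.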

Code for this demo is available here.
