An Alternative OpenAI API Client: Service Version
Building An OpenAI API Client Service: Comprehensive Walkthrough
Introduction
One thing about OpenAI: they have made an API that is very easy to use. Assuming that you are comfortable with Python, calling their most common text completion API takes as little as two lines of code:
import openai

response = openai.Completion.create(prompt=prompt,
    engine="text-davinci-002", temperature=0.4, max_tokens=1024)
The actual software development effort consists primarily of putting these two lines into a context where the prompt and the response can be useful. The web has a number of posts that outline the steps for placing this code into a Python application. It’s a great place to start, and it should be adaptable to most existing environments.
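As a minimal sketch of one such context, consider a hypothetical command-line script (my illustration, not OpenAI’s). It assumes the API key is available in an OPENAI_API_KEY environment variable, a convention adopted later in these notes.

import os
import sys

import openai

# Assumed convention: the key lives in an environment variable.
openai.api_key = os.getenv("OPENAI_API_KEY")

# The prompt is taken from the command-line arguments.
prompt = " ".join(sys.argv[1:])
response = openai.Completion.create(prompt=prompt,
    engine="text-davinci-002", temperature=0.4, max_tokens=1024)

# The response object carries the completion text in choices[0].text.
print(response.choices[0].text.strip())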
But what if you don’t have any of this context? I’ve put together some modest Python applications in the past, but my recent work has largely been Java additions to existing services. What if your goals are different? A Python application makes for a simple demo, but a service can provide more value.
These notes sketch out 10 steps in the development of a working deployed cloud service. Starting with a blank development machine (e.g. a nice laptop), I work through the steps of preparing the development environment, preparing the cloud environment, and deploying a demo service for OpenAI API integration. You might follow these notes on your own service development effort, or use them as a resource to resolve any of the unexpected challenges that arise in your independent efforts.
Project Goals
For context, the demo service arose through the combination of a nascent idea for educational software, the arrival of my API key, and the opportunity to learn some new technologies. These circumstances brought into focus a set of clear objectives for the demo service.
Development on Windows 10.
Use Visual Studio Code as the IDE.
Use the “native” Python API from OpenAI.
GitHub for version control.
Use Google Cloud.
Use PowerShell (vs. bash).
These development tools and deployment contexts are fairly arbitrary choices. They're driven largely by personal taste, weird loyalty quirks, industry standards, product linkage, and the opportunity to learn the cool new stuff. I’d be delighted to see contributions that fill out other paths (e.g. other cloud providers, AWS?).
Regardless of your choice of tools, getting a demo service operational requires the creation of a lot of context. Considered from the perspective of setting up a brand new development system for a brand new service, there are a large number of components that need to be configured.
Windows 10. It’s different from Unix, Linux, and macOS. I hear Windows 11 is similar.
For the IDE, you’ll need to install it and configure it for Python editing and debugging.
For Python, you’ll need to install a Python interpreter and pip. You’ll probably want to set up a “virtual environment”.
To use the OpenAI API, you’ll need to get an API key. It took me about six weeks, but I suspect things are faster now.
To use the Python API, you’ll need to download it and figure out how to provide an API key at runtime.
To use GitHub, you’ll need an account and repository creation privileges. You’ll want to set up SSH credentials to simplify check-ins (e.g. git push) and other operations (e.g. git pull). After creating a repository at the server, you’ll want to set up a local clone repository that is accessible to the IDE.
To use Google Cloud, you’ll need an account and project creation privileges. After creating a new project on Google’s cloud, you’ll need to download their CLI tools. You’ll also need to set up a Dockerfile for the installable image.
For GitHub and Google Cloud, a simple account registration with these providers is sufficient. For these small-scale experiments, the “free” quality of service is sufficient for single-user development. Access to an OpenAI API key requires a paid subscription, about $10/mo.
Platform
Ready access to a development class Windows 10 laptop made this choice purely practical. Good versions of all the necessary tools seem to be available for the common platforms. It’ll be a good learning experience after several years in the Unix, Linux, and macOS universe.
For some security control, the main development account should rarely need special privileges. However, installing software and debugging remote processes require access to elevated privileges.
Activating a Python virtual environment also requires permission to execute local scripts. The Get-ExecutionPolicy -list command lists the current script execution policies. If the value for all scopes is Undefined, the command Set-ExecutionPolicy RemoteSigned (or Unrestricted) should allow activation of a Python virtual environment.
IDE Set Up
For many years, I’ve been a devotee of Eclipse. But Visual Studio Code (VSC) seems to be the cool newness.
Installing and configuring VSC is fairly straightforward. Microsoft makes the download available directly on the main web page https://code.visualstudio.com/. VSC is delivered as a standard Microsoft installer. Make a few choices, hit finish, and the basics of the VSC IDE are installed and operational.
VSC does not include Eclipse’s heavyweight notion of a workspace. Any folder can be opened, and it immediately becomes the workspace. VSC does store some settings in a .vscode directory, but this directory is not required.
To work effectively with Python, several VSC extensions need to be installed. These include the Python extension, which brings in a handful of other components. While we’re installing extensions, the Docker extension (used in the deployment steps) can be added now.
Installing the VSC Python extension will trigger an alert to install Python. The various prompts send you to the Microsoft Store and another installation process. Painless enough, but yet another step.
PIP and Virtual Environment
The basic Python installation only includes a few packages, such as os. The initial “Hello, World” demo server will need two additional packages (Flask and gunicorn). Later integration with the OpenAI API will add the openai package.
pip is the package installer for Python. By convention, the required packages and their versions are kept in a requirements.txt file. The initial contents for the starter service are
Flask==2.1.0
gunicorn==20.1.0
The pip install command will acquire these packages from the standard repositories (e.g. the Python Package Index, PyPI).
pip install -r requirements.txt
If you plan to work on multiple Python projects, perhaps with different version dependencies, a virtual environment is recommended by Python aficionados. The virtual environment can isolate one project from another.
I think it works like this. The commands below will set up the virtual environment and activate it. If script execution is disabled (e.g. Get-ExecutionPolicy shows Undefined), then you will need to modify the ExecutionPolicy as described in the Platform section.
python -m venv .venv
.venv\Scripts\Activate.ps1
I did this once, on a different machine, and it seemed to work. But my current development machine is being cranky with Python in PowerShell and I cannot confirm that these commands do the right thing. Your mileage may vary.
Hello, World!
With the basics in place for Python development, it is time to create an initial service.
Google’s quickstart for Python as a Cloud Run service is a straightforward and simple set of steps. [https://cloud.google.com/run/docs/quickstarts/build-and-deploy/deploy-python-service]
These steps are where the Google Cloud CLI is installed and used. It is delivered as a standard Windows installer that prepares the Google Cloud CLI for PowerShell.
Google’s sample helloworld/main.py is a good place to start for the basic framework. These lines, and a few additional files, all go into a single helloworld directory.
import os

from flask import Flask

app = Flask(__name__)

@app.route("/")
def hello_world():
    """Example Hello World route."""
    name = os.environ.get("NAME", "World")
    return f"Hello {name}!"

if __name__ == "__main__":
    app.run(debug=True, host="0.0.0.0", port=int(os.environ.get("PORT", 8080)))
Google has made the perfectly reasonable decision to encourage Flask as the framework for Python-based HTTP services in their execution environment. It saves me a bunch of research and decisions. I’m sure Django and other alternatives work well, too.
Although this code is intended for deployment to Google Cloud Run, it executes locally as well. VSC can debug the source file (main.py) quite satisfactorily. It’s a great comfort to know that breakpoints can be set where needed as the code grows more complex.
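For a quick smoke test outside the debugger, a short script can hit the local route. This sketch uses the requests package, which is not part of the service’s requirements and would need a separate pip install.

import requests

# Assumes main.py is running locally on the default port 8080.
resp = requests.get("http://127.0.0.1:8080/")
print(resp.status_code, resp.text)  # expect: 200 Hello World!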
Deploying a Service
Getting the “Hello, World” demo service deployed as a Google Cloud Run service requires a few more steps. These steps are nicely itemized in Google’s quickstart guide. Basically, the goals are integration with Docker to create a container and pushing the container to the cloud for deployment.
Integrating with Docker requires a Dockerfile to specify the inclusions and a .dockerignore file to specify the exclusions. Because Docker is building a Python service, the Docker integration requires some Python integration. All of this content is provided in Google’s quickstart guide.
# Use the official lightweight Python image.
# https://hub.docker.com/_/python
FROM python:3.11-slim
# Allow statements and log messages to immediately appear in the logs
ENV PYTHONUNBUFFERED True
# Copy local code to the container image.
ENV APP_HOME /app
WORKDIR $APP_HOME
COPY . ./
# Install production dependencies.
RUN pip install --no-cache-dir -r requirements.txt
# Run the web service on container startup. Here we use the gunicorn
# webserver, with one worker process and 8 threads.
# For environments with multiple CPU cores, increase the number of workers
# to be equal to the cores available.
# Timeout is set to 0 to disable the timeouts of the workers to allow Cloud Run to handle instance scaling.
CMD exec gunicorn --bind :$PORT --workers 1 --threads 8 --timeout 0 main:app
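The .dockerignore contents are less interesting. A minimal version, along the lines of Google’s sample, keeps repository metadata and Python build artifacts out of the image:

Dockerfile
README.md
*.pyc
*.pyo
*.pyd
__pycache__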
The reference requirements.txt document prepares the Python environment with the provided execution frameworks.
Flask==2.1.0
gunicorn==20.1.0
The reference requirements.txt provides a bit of insight into Google’s engineering for Python services. In addition to the Flask framework for HTTP services, they encourage the use of gunicorn as the server. That seems like another fine decision that saves me effort, again.
To complete the deployment of the initial service implementation, it needs to be pushed to Google Cloud. This is a job for Google Cloud CLI, within a PowerShell window.
gcloud run deploy
Google’s quickstart guide spells out the answers you’ll need for the possible deployment prompts. Then wait a few moments until the deployment is complete. On success, the command line displays the service URL.
If everything is hooked up, you’ll get a nice “Hello, world” response when you visit the provided service URL.
Use the OpenAI API
At this point, there is a simple service that can be deployed to the cloud. The next step is adding an endpoint built around the snippet of code that calls the OpenAI completion method.
A new query endpoint implements this HTTP request. It transforms the HTTP request into a call to the openai.Completion.create() method.
The use of the OpenAI API requires an import of the package’s definition. The import openai statement brings an installed package into the Python interpreter.
The openai package still needs to be installed into the Python environment with pip. Adding the openai package to the demo service’s environment is a simple process. Append the openai library and its version to requirements.txt.
openai==0.27.7
And rebuild the entire set of required libraries.
pip install -r requirements.txt
This package installation integrates well with the deployment process since the changes are captured in requirements.txt. It also aligns with Python conventions for environment maintenance.
With the API available to the demo service, it is time to address the transformation of the HTTP request. In a production service, any text input would require validation and escaping. There might be some dialogue with the user, and perhaps some checkboxes that capture the user's interests or goals.
The immediate goal is to send any prompt to the API. The checkboxes can be generalized into an advanced prompt generator at a later date. For simplicity, the transformation process here directly uses the request’s query string as the prompt text.
Putting these changes all together, the following lines are added to the demo service’s implementation.
import openai
from flask import request

@app.route("/query")
def query():
    """Simple query with response from OpenAI"""
    # query_string is bytes; decode it and undo the '+' encoding of spaces
    prompt = request.query_string.decode("utf-8").replace('+', ' ')
    return openai.Completion.create(prompt=prompt,
        engine="text-davinci-002", temperature=0.4, max_tokens=1024)
After making these changes, launch the demo service locally and, with great anticipation, test the local http://127.0.0.1:8080/query endpoint for an initial response.
That initial response is an AuthenticationError page. It comes with a whole bunch of details and identifies the cause of the difficulties.
openai.error.AuthenticationError: No API key provided. You can set your API key in code using ..
Setting the API Key Locally
Getting an OpenAI API key requires setting up a paid account with OpenAI. Once the account is established, you can request access to the API. Early demand seemed to be very high. It took me almost 6 weeks to get access, but I kept busy with other projects.
Once you obtain access to the API, getting the keys is straightforward. The OpenAI endpoint https://platform.openai.com/account/api-keys is an interface to create and manage the keys for your account. You’ll want to make a copy of the value, but be sure to keep any permanent copies in secure documents.
For simple local testing, it’s tempting to simply write the API key into the code. It’s on your local machine, you have good physical security for it, and the threat is very low.
Adding the API key assignment in the local code allows you to run the service locally. The query endpoint will pass the authorization check, and you can confirm that the string is correct. Let’s run the experiment locally and confirm some essential details at a very low cost.
openai.api_key = ".. value provided by OpenAI"
Although this code confirms that the value is correct, this code and the API key can never leave the local machine. Pushing this experimental code to Git would widely expose the secret. Another approach is required, even for a demo service.
One common solution for passing secrets is to provide them as environment variables. The service can acquire the secrets that it needs, but they are never in the service’s code or other public artifacts. The management of the secrets becomes somebody else’s problem, and the service can do what it needs to do.
Python’s os.getenv() method provides access to the environment variables, so the assignment of openai.api_key should be changed to use that value.
openai.api_key = os.getenv("OPENAI_API_KEY")
For testing purposes, the environment variable OPENAI_API_KEY needs to be set in the context of the VSC debugger. That context is maintained by the Python Debug Console, a PowerShell process. You can type an environment variable definition into the TERMINAL pane.
$env:OPENAI_API_KEY = ".. value provided by OpenAI"
Setting an environment variable in the Python Debug Console and accessing that value through Python code ensures the safe handling of the secret. The console and its state disappear anytime VSC is restarted, so the value does not persist. The value is not present in any of the code. The development and execution environments run the demo service as expected without exposing the API key.
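One optional refinement, my own addition rather than part of any sample: a small guard at startup makes a missing key obvious immediately, instead of surfacing later as an AuthenticationError.

import os
import sys

import openai

openai.api_key = os.getenv("OPENAI_API_KEY")
if not openai.api_key:
    # Fail fast with a clear message instead of a delayed AuthenticationError.
    sys.exit("OPENAI_API_KEY is not set in this environment.")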
Handling the Response
With the environment variable holding the correct API key, and the code using that variable to set the authorization key, the locally executing demo service can finally complete the expected behaviors. Tests of the http://127.0.0.1:8080/query endpoint will successfully send requests and display the response from the OpenAI text completion endpoint.
The response is considerably more than a string of tokens. The full response is a complex object with multiple nested properties. The Python/Flask framework displays these objects as JSON objects. A normal request and response pair takes the following form.
{
"choices": [
{
"finish_reason": "stop",
"index": 0,
"logprobs": null,
"text": "\n\nYes, we are good."
}
],
"created": 1687297904,
"id": "cmpl-7TdYG60bn9UkoCi3bSTjwEPpNOUnX",
"model": "text-davinci-002",
"object": "text_completion",
"usage": {
"completion_tokens": 8,
"prompt_tokens": 6,
"total_tokens": 14
}
}
For this simple query, the primary result is the text Yes, we are good. This value is available within the response.choices[0].text property. Curiously, the desired text is prefixed with two newlines (\n\n). Some services may want to remove it. For HTML, it’s simply ignored leading whitespace. More sophisticated services can use the other properties to coordinate advanced features.
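If the service should return only the completion text, a hypothetical helper along these lines pulls out and trims that property:

def completion_text(response):
    """Return the first completion's text with the leading newlines removed."""
    return response.choices[0].text.lstrip()

The query endpoint could then return completion_text(...) instead of the full JSON object.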
Setting the Service API Key
Now that the service runs locally, the demo service can be deployed to Google Cloud Run.
The Dockerfile configures the Python service’s execution environment via pip, and pip prepares the execution environment from requirements.txt. Since the revised demo service requires openai, that dependency needs to be added to the execution environment.
Flask==2.1.0
gunicorn==20.1.0
openai==0.27.7
The revised service can now be deployed to Google Cloud Run.
gcloud run deploy
Without the addition of openai in requirements.txt, the service would fail to start due to a missing dependency.
However, if you hit the /query endpoint on the newly launched service, you hit the same AuthenticationError page from earlier. You have to provide the API key secret in the Google Cloud Run execution environment.
It’s easy to be led astray with the Google recommendations for secret management. Their Secret Manager can tightly control where a secret is used, and which people can see the secret. As they point out “environment variables are visible to anyone with project Viewer permissions or greater”.
For an initial demo service with limited development work, the Secret Manager is a bit overkill. A full production service should adopt these practices, but your initial service can stick with environment variables.
Setting an environment variable for a Google Cloud Run container is well described in their online documentation [https://cloud.google.com/run/docs/configuring/environment-variables].
On the Cloud Run page [https://console.cloud.google.com/run], select the service to see the details page. From here, the Edit & Deploy New Revision button takes you to the proper configuration page. Environment variables are toward the bottom.
Once the new environment variable is configured, you’ll need to relaunch the service. There is a button for relaunching the service at the bottom of the page.
With all these parts put together, sending a request to the hosted service should produce results similar to your locally hosted service. Since it is a service, the OpenAI connection can be shared widely. You’ve completed a core objective, the construction of a service that handles requests for OpenAI text completion.
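As a quick check, a short script can exercise the hosted endpoint. The URL below is a hypothetical stand-in for the one reported by gcloud run deploy.

import requests

# Substitute the service URL printed at the end of the deployment.
url = "https://helloworld-abc123-uc.a.run.app/query?Are+we+good"
print(requests.get(url).text)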
Saving the Source Code
Now that there is a working service, it is time to publish the source code. First and foremost, a public repository creates a durable version of the code that is independent of your hard drive. It also allows others to contribute to the service.
Assuming you have an existing GitHub account, creating the new repository is straightforward. If it is part of a larger project, you’ll want to create the repository within an organization rather than as a personal repository. Creating the repository within an organization will simplify sharing across the organization.
If you have a completely new development environment or GitHub account, you will also want to configure the SSH authentication keys. SSH authentication keys allow git commands to work without providing explicit credentials.
Configuring SSH authentication keys requires accessing your account details. From the account profile, select the account image (“avatar”). Under the Access section on the sidebar, select the option SSH and GPG keys [https://github.com/settings/keys]. The provided links for generating SSH keys supply good instructions for configuring SSH authentication keys.
After creating the repository, you’ll need to clone it onto your local machine. The clone will be empty or mostly so. Depending on your creation choices and the current GitHub preferences, there may be some starter content such as README.md or .gitignore files.
git clone git@github.com:<your-org>/service-repo.git
cd service-repo
git config user.email <your-email>
git config user.name "Your Name"
The config steps smooth the flow of git operations, especially git commit commands. If you consistently work with the same organization and the same email identity, these configuration options can be set at the user level with the --global flag.
With the repository clone in place, you’ll want to copy the current source files into a repository directory. If the current source files are under a helloworld directory, this can be achieved with a simple tree copy. After the copy, the repository should have four files (main.py, Dockerfile, .dockerignore, requirements.txt).
From here, the new files are added to a commit and pushed to GitHub.
git add -A
git commit -m "Initial OpenAI service."
git push
For an established repository, these changes might use a branch to bring changes into the existing service. Since this commit is the first contribution to an empty repository, you can use a lighter-weight process.
Next Steps
With a working service for connecting web requests to the OpenAI API, there are lots of directions to pursue.
The Python API provides many other services beyond text completion. There is an endpoint that returns the list of supported models. Other endpoints provide support for moderation or chat behaviors. Preparing prompts and parsing the response from all these choices offers endless opportunities.
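As a taste of those endpoints, here is a sketch against the same openai 0.27.x package used above; the model name and message are illustrative choices, not requirements.

import os

import openai

openai.api_key = os.getenv("OPENAI_API_KEY")

# List the models available to this account.
models = openai.Model.list()
print([m.id for m in models.data])

# A minimal chat request, for comparison with text completion.
chat = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Are we good?"}])
print(chat.choices[0].message.content)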
More advanced directions include ChatGPT plugins and query tuning with embeddings and vector databases. These can be used to integrate specialized knowledge into the foundation provided by the OpenAI products.
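For embeddings specifically, the same package exposes a single call. A minimal sketch, with the model choice as an assumption:

import os

import openai

openai.api_key = os.getenv("OPENAI_API_KEY")

# Request an embedding vector for a piece of specialized content.
emb = openai.Embedding.create(model="text-embedding-ada-002",
                              input="Cloud Run services and the OpenAI API")
print(len(emb.data[0].embedding))  # the dimensionality of the vector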
Closing
Stripped down to its basics, a Python service that runs as a Google Cloud Run service and passes requests to the OpenAI API is less than 20 lines of code.
import os

import openai
from flask import Flask
from flask import request

app = Flask(__name__)
openai.api_key = os.getenv("OPENAI_API_KEY")

@app.route("/query")
def query():
    """Simple query with response from OpenAI"""
    # query_string is bytes; decode it and undo the '+' encoding of spaces
    prompt = request.query_string.decode("utf-8").replace('+', ' ')
    return openai.Completion.create(prompt=prompt,
        engine="text-davinci-002", temperature=0.4, max_tokens=1024)

if __name__ == "__main__":
    app.run(debug=True, host="0.0.0.0", port=int(os.environ.get("PORT", 8080)))
The evolution of these lines, and getting them to execute in a standard server context, requires a considered effort. Weaving together a context that makes the API useful involves tying up a lot of loose ends.
Those loose ends can be a surprise and a challenge when they arise unexpectedly. Although it is difficult to eliminate that work, it is much easier when you know what to expect. This summary of the development and deployment experience can help you avoid surprises and offers some solutions to likely challenges.