Automate GTM with OpenAI, Python & Streamlit

Using GPT-3 models through the OpenAI API to generate sales and marketing data.

Dave Melillo
10 min read · Jan 31, 2023

Table of Contents:

Introduction 👋

OpenAI 🤖

The Scenario 🎬

Know Thyself 🧠

The Prompt 📜

The API 💻

The Results 🔢

TL;DR: This post will focus on the general application of OpenAI, Python and Streamlit. All of the loops, CSV processing, Streamlit formatting and Python code can be found in detail here. You can also access the Streamlit app here.

DALL-E generated images of “AI Robot Sales Person” … only slightly terrifying

Introduction

I have been adjacent to sales and marketing in my recent roles, and it’s given me a unique opportunity to chart the impact modern data techniques have had on Go-To-Market (GTM) functions.

It started innocently enough, with data providing an extra layer of context, insights and analysis on top of an ever-growing sales and marketing tech stack.

This evolved into data taking center stage, enabling complex marketing attribution models, feature-rich segmentation, and prediction models to help sales/marketing prioritize their efforts.

We are now at the point where, if you aren’t a data-driven GTM organization, you are probably falling behind in every other business metric. Automation, AI and DataOps have changed GTM forever (for good), and the revolution is nowhere near complete as new technologies such as OpenAI are introduced.

OpenAI

It’s been interesting to read everyone’s reaction to OpenAI. There are many, including myself, who immediately dismissed it as another trendy AI tool that would disappear into obscurity soon enough. But I’ll be the first one to admit how wrong I was.

I was initially inspired to work with the OpenAI API by a popular post from Lucas Perret that walked people through integrating Google Sheets and the OpenAI API with a tool called n8n.

Over the weekend I started to step through Lucas’ demo, but hit a few walls trying to work within n8n’s framework. So I set out to take Lucas’ core idea and replicate it using Python. My final product is actually a Streamlit app that requires absolutely no code to operate, but I am hoping this post can inspire those interested in OpenAI at all levels.

The Scenario

We’re going to take the perspective of a sales/marketing team trying to scale their outbound prospecting efforts.

An obvious assumption in sales is that the more personalized your communication, the more effective it will be. This is more than an assumption, as companies like Clearbit and ZoomInfo have made millions offering sales intelligence as a service.

What Lucas made abundantly clear is that you can use the OpenAI API to enrich prospects with helpful information such as their value proposition, industry, target audience and whether the company is B2B or B2C.

I took Lucas’ initial enrichment a little further, experimenting with OpenAI to generate messaging that can be combined with the information above to create a personalized prospecting experience at scale.

Know Thyself

Before we spend all this time generating info about our prospects, it’s important that we also understand our own business. There are a number of ways to do this, but my approach was to leverage easily accessible web content.

During this example, I am going to SELL FROM the perspective of a great company called Hightouch. I am not officially affiliated with Hightouch and I have never worked for them, but I am using them for a couple of reasons:

  1. I love the product. I was an early adopter and I have been a customer for a long time. I think Hightouch does reverse ETL/data activation better than anyone.
  2. As a data professional, I understand Hightouch’s value proposition from a business and technical perspective. This allows me to gauge the appropriateness of the features and messaging OpenAI is generating, which is something I couldn’t do in an unfamiliar sector.

With that said, we get to our first piece of code, which has nothing to do with OpenAI. The code below accesses a website (e.g., hightouch.com), parses and cleans the HTML response, and returns the first 1,000 characters.

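import requests
from bs4 import BeautifulSoup

# sell_from is the bare domain entered in the app, e.g. "hightouch.com"
sell_from = "hightouch.com"
# format sell-from website as a full URL
sell_from = 'https://' + sell_from
# get content (the timeout keeps a slow site from hanging the app)
page = requests.get(sell_from, timeout=10)
# transform content
soup = BeautifulSoup(page.content, "html.parser")
# clean content: take the page text and strip newlines
souper = soup.text
souperx = souper.replace("\n", "")
# set prompt input from the first 1,000 characters of clean content
selling_values = souperx[:1000]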

The output (selling_values) for hightouch.com looks like this:

This gives OpenAI enough content and keywords to create a relevant message FROM Hightouch, regardless of the information we gather about the prospect companies.

The Prompt

The prompt is what drives everything in the OpenAI completions endpoint. These are the words/questions/context you feed into OpenAI to produce results. It’s the programmatic equivalent of going to chat.openai.com and typing in a prompt manually to elicit a response.

We already have the selling_values, which gives us context about the SELL FROM company from their website.

We repeat those same steps for the prospect(s), creating a variable called prospect_values which gives us context about the SELL TO company from their website.
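Because the same clean-up runs for both the SELL FROM and SELL TO sites, it could be wrapped in one helper. The sketch below is only an illustration, not the app's code: it swaps BeautifulSoup for the standard library's html.parser so it has no dependencies, the names html_to_context and _TextExtractor are hypothetical, and the sample HTML is made up. It also differs from the snippet above in two ways: it skips script and style contents, and it collapses whitespace into single spaces rather than deleting newlines, so adjacent words do not fuse.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects text nodes, skipping <script> and <style> contents."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.chunks.append(data)


def html_to_context(html, limit=1000):
    """Return the first `limit` characters of the page text, whitespace-normalized."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    text = " ".join(" ".join(parser.chunks).split())
    return text[:limit]


# Made-up HTML standing in for a fetched page (e.g. requests.get(url).text)
sample = (
    "<html><head><style>p{color:red}</style><script>var x = 1;</script></head>"
    "<body><h1>Sync data</h1>\n<p>No APIs.\nNo CSVs.</p></body></html>"
)
print(html_to_context(sample))  # Sync data No APIs. No CSVs.
```

Calling the helper once per site would yield selling_values and prospect_values in the same shape.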

Armed with contextual information about the SELL FROM and SELL TO companies, we can use that data to build the prompt.

prompt = f"""
This is the content of the prospect company website {prospect_values}

This is the prospect company website {sell_to}

This is the selling company's product offering: {selling_values}

In a JSON format:

- Give me the value proposition of the prospect company. In less than 25 words. In English. Casual Tone. Format is: "[Company Name] helps [target audience] [achieve desired outcome] and [additional benefit]"

- Give me the industry of the prospect company. (Classify using this industry list: [Energy, Materials, Industrials, Consumer Goods, Health Care, Wellness & Fitness, Finance, Software, Communication, Entertainment, Utilities, Agriculture, Arts, Construction, Education, Legal, Manufacturing, Public Administration, Advertisements, Real Estate, Recreation & Travel, Retail, Transportation & Logistics])

- Guess the target audience of each prospect company.(Classify and choose 1 from this list: [Sales Teams, Marketing Teams, Product Teams, HR teams, Customer Service Teams, Consumers, C-levels, Data Teams, DevOps Teams, Programmers, Finance Teams])

- Tell me if the prospect company is B2B or B2C.

- Include the original prospect company website mentioned above.

- Write an email message from selling company to prospect company, aligned with selling company's product offering. This email should attempt to sell prospect company the selling company's product. In less than 125 words. In English. Casual, Quirky Tone.

format should be:
{{"value_proposition": value_proposition,
"industry": industry,
"target_audience": target_audience,
"market": market,
"website": website,
"short_email": short_email}}

JSON:
"""

Again, I would never have gotten to this point without going through Lucas’ post first. He illustrated how feeding a structured prompt like this into OpenAI produces a JSON object that can be easily converted into a dataframe. Let’s break the prompt down a little further.

The first part of the prompt sets variables and context for the rest of the prompt. This is where we take advantage of the prospect_values and selling_values we scraped from the web.

This is the content of the prospect company website {prospect_values}

This is the prospect company website {sell_to}

This is the selling company's product offering: {selling_values}

Points 1 through 4 below were in Lucas’ original post. I extended them a bit by adding more industry values to the list of industries and adding new target audiences to the list of target audiences.

I was so impressed by this, as I understood the effort it takes to produce these results with traditional ML classification and Natural Language Processing.

I was also impressed by the level of customization that can be achieved, by changing the value proposition format or providing different industry/target audience classifications specific to your business.

  1. Give me the value proposition of the prospect company. In less than 25 words. In English. Casual Tone. Format is: "[Company Name] helps [target audience] [achieve desired outcome] and [additional benefit]"

  2. Give me the industry of the prospect company. (Classify using this industry list: [Energy, Materials, Industrials, Consumer Goods, Health Care, Wellness & Fitness, Finance, Software, Communication, Entertainment, Utilities, Agriculture, Arts, Construction, Education, Legal, Manufacturing, Public Administration, Advertisements, Real Estate, Recreation & Travel, Retail, Transportation & Logistics])

  3. Guess the target audience of each prospect company.(Classify and choose 1 from this list: [Sales Teams, Marketing Teams, Product Teams, HR teams, Customer Service Teams, Consumers, C-levels, Data Teams, DevOps Teams, Programmers, Finance Teams])

  4. Tell me if the prospect company is B2B or B2C.

  5. Include the original prospect company website mentioned above.

  6. Write an email message from selling company to prospect company, aligned with selling company's product offering. This email should attempt to sell prospect company the selling company's product. In less than 125 words. In English. Casual, Quirky Tone.

Points 5 and 6 are what I added to the prompt. I had to play around with the email message portion quite a bit to get acceptable results, but the end result is a “Quirky” and “Casual” message, no longer than 125 words, that leverages all of the contextual data we provided from the sell_to and sell_from company websites.
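One way to make that customization explicit is to pass the classification lists, word limit and tone in as parameters instead of hard-coding them in the prompt text. The sketch below is a hedged illustration rather than the app's code: build_prompt and its arguments are hypothetical names, and the prompt is abbreviated to three of the six points to keep it short.

```python
def build_prompt(prospect_values, sell_to, selling_values,
                 industries, audiences,
                 email_words=125, tone="Casual, Quirky"):
    """Assemble an abbreviated enrichment prompt from configurable pieces."""
    industry_list = ", ".join(industries)
    audience_list = ", ".join(audiences)
    # Doubled braces ({{ }}) produce literal braces inside an f-string.
    return f"""
This is the content of the prospect company website {prospect_values}

This is the prospect company website {sell_to}

This is the selling company's product offering: {selling_values}

In a JSON format:

- Give me the industry of the prospect company. (Classify using this industry list: [{industry_list}])

- Guess the target audience of the prospect company. (Classify and choose 1 from this list: [{audience_list}])

- Write an email message from selling company to prospect company, aligned with selling company's product offering. In less than {email_words} words. In English. {tone} Tone.

format should be:
{{"industry": industry,
"target_audience": target_audience,
"short_email": short_email}}

JSON:
"""


# Placeholder inputs; real values would come from the scraped websites.
prompt = build_prompt(
    prospect_values="(scraped prospect text)",
    sell_to="https://fandom.com",
    selling_values="(scraped seller text)",
    industries=["Software", "Entertainment", "Retail"],
    audiences=["Data Teams", "Marketing Teams", "Consumers"],
    email_words=80,
    tone="Formal",
)
print(prompt)
```

With this shape, trying a different industry taxonomy or email tone is a change to the arguments rather than an edit to the prompt string.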

The API

I am just beginning my work with the OpenAI API. You should definitely read the docs to get a complete understanding of the capabilities here, but I will speak to some of the choices I made in the code below:

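import json

import openai  # legacy SDK interface (openai<1.0); newer releases changed this API
import pandas as pd

# Authenticate (openai_key is your own API key, supplied through the app)
openai.api_key = openai_key
# Set up the model
model_engine = "text-davinci-003"
# Start with an empty dataframe to collect results
finaldf = pd.DataFrame()
# Generate a response
completion = openai.Completion.create(
    engine=model_engine,
    prompt=prompt,
    max_tokens=1024,
    n=1,
    stop=None,
    temperature=0.5,
)
# access response text
response = completion.choices[0].text
# transform to json/dict
json_object = json.loads(response, strict=False)
# append to df (DataFrame.append was removed in pandas 2.0, so use concat)
finaldf = pd.concat([finaldf, pd.DataFrame([json_object])], ignore_index=True)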

I chose the Davinci model because it seemed like the most capable and accurate option compared to Curie, Babbage and Ada. It also has the most recent training data (up to June 2021). The description of text-davinci-003 from the docs really sells itself:

Most capable GPT-3 model. Can do any task the other models can do, often with higher quality, longer output and better instruction-following. Also supports inserting completions within text.

Another important parameter is temperature. You can think of this as the creativity aperture for OpenAI. As described in the docs, “Higher [temperature] values means the model will take more risks”.

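A different way to configure the creativity allowance of the model is the top_p parameter (nucleus sampling), which restricts sampling to the most likely tokens whose combined probability mass reaches top_p. For example, a top_p of 0.1 means only the tokens comprising the top 10% of probability mass are considered. The docs recommend adjusting temperature or top_p, but not both.

Either way, I kept the temperature at 0.5 to strike a balance between accuracy and creativity.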
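To build some intuition for the two knobs, here is a toy sketch of how temperature and top_p are commonly described as reshaping a next-token distribution. It is an illustration under stated assumptions, not OpenAI's implementation: the four logits are made up, and apply_temperature and top_p_filter are hypothetical helper names.

```python
import math


def apply_temperature(logits, temperature):
    """Softmax over logits divided by temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def top_p_filter(probs, top_p):
    """Keep the most likely tokens until their cumulative mass reaches top_p, then renormalize."""
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, mass = [], 0.0
    for i in ranked:
        kept.append(i)
        mass += probs[i]
        if mass >= top_p:
            break
    return {i: probs[i] / mass for i in kept}


logits = [2.0, 1.0, 0.5, 0.1]  # made-up scores for four candidate tokens
cool = apply_temperature(logits, 0.5)  # lower temperature sharpens the distribution
warm = apply_temperature(logits, 1.0)  # higher temperature flattens it
nucleus = top_p_filter(warm, 0.8)      # drops the least likely token entirely

print([round(p, 3) for p in cool])
print([round(p, 3) for p in warm])
print(sorted(nucleus))
```

At the lower temperature the top token takes a larger share of the probability, which is why lower values read as more "accurate" and higher values as more "creative".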

The Results

Let’s step through the Streamlit app to see the culmination of this code and the eventual output.

Step 1: Gather the SELL FROM data. We are using Hightouch as our SELL FROM company in this example. If you are actually using this app, don’t forget to bring your own API credentials!

Step 2: Upload a CSV list of prospect domains. The input format is very specific but also very simple. Images of the CSV and processed CSV in Streamlit are below.

CSV INPUT FILE
CSV Processed Example

Step 3: When the SUBMIT button is clicked, all of the prospects uploaded in Step 2 are converted to a list and processed through a loop that creates the OpenAI prompt and makes the API call. The results for each company are appended to a new dataframe for export.

Step 4: The results dataframe from Step 3 is converted to a CSV file and made available for download through Streamlit. The results of our example are below.
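Steps 3 and 4 boil down to a loop, a parse and an export. The sketch below shows that shape using only the standard library (csv and json standing in for pandas and Streamlit); it is not the app's actual code. The names enrich, parse_completion, to_csv and fake_complete are hypothetical, and the canned responses are invented so the example runs offline. One assumption worth calling out: a completion may not always be clean JSON, so the sketch records failures per prospect instead of stopping the whole batch.

```python
import csv
import io
import json

EXPECTED_KEYS = ["value_proposition", "industry", "target_audience",
                 "market", "website", "short_email"]


def parse_completion(text):
    """Load the outermost {...} span of the completion as a dict."""
    start, end = text.find("{"), text.rfind("}")
    if start == -1 or end <= start:
        raise ValueError("no JSON object found in completion")
    return json.loads(text[start:end + 1], strict=False)


def enrich(prospects, complete):
    """Run each prospect through `complete`; collect rows and failures separately."""
    rows, failures = [], []
    for domain in prospects:
        try:
            record = parse_completion(complete(domain))
        except ValueError as exc:  # also covers json.JSONDecodeError
            failures.append((domain, str(exc)))
            continue
        # Missing keys become empty cells so every row has the same columns.
        rows.append({key: record.get(key, "") for key in EXPECTED_KEYS})
    return rows, failures


def to_csv(rows):
    """Serialize the rows to CSV text, ready to offer as a download."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=EXPECTED_KEYS)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def fake_complete(domain):
    # Stand-in for building the prompt and calling the API; canned text, not model output.
    if domain == "broken.example":
        return "Sorry, I cannot help with that."
    return '\n{"industry": "Software", "market": "B2B", "website": "https://%s"}' % domain


rows, failures = enrich(["acme.example", "broken.example"], fake_complete)
csv_text = to_csv(rows)
print(csv_text)
print(failures)
```

In the real app, the `complete` argument would be a function that scrapes the prospect site, builds the prompt and returns the completion text.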

The value proposition, industry, target audience and market are very accurate. I have tested this on longer lists with diverse companies of all sizes, in all types of industries, and the results remained surprisingly consistent.

The messaging results vary from case to case, but at the very least they give us a starting point for relevant communication with our prospect.

One of the better results is the message OpenAI put together for https://fandom.com:

Hey there! We’re Hightouch, a platform that helps you sync data to any SaaS tool with just SQL. No APIs. No CSVs. We’d love to help you activate your data and make your fan-created content and communities even more amazing. What do you think? Let us know!

Notice that the selling_values were used as the base of the message (just SQL. No APIs. No CSVs), with allusions to the prospect_values as well (make your fan-created content and communities even more amazing).

OpenAI nails the “Quirky” tone with a “Hey there!” to begin the message and a “What do you think? Let us know!” to end things. Experimenting with the tone, length and content of this message is easy with simple text inputs in our prompt. It requires some trial and error, but doing this at scale/speed with classic ML/NLP would be a daunting task.

The End?

From here the story is yours! Just as I took inspiration from Lucas Perret’s original post, I hope this inspires you to start working with OpenAI. The buzz around OpenAI and what it can do reminds me of the early days of modern data, where there were more ideas than people. During these times, when innovation is driven by the pure joy of discovery, quantum leaps in technology, business and culture are achieved.

If you happen to extend this code or if you start using the Streamlit app for your own GTM purposes, I would love to hear from you!

Thanks for reading.

Written by Dave Melillo

The Full Data Stack! Data Engineer, Data Architect, Data Scientist ++ practical application of data science 🛠
