In this episode of The Check with Joseki Tech, the team walks through enhancements to HubBoost, a proprietary AI solution for sales and marketing email generation.
Key points include:
Batch Processing: New features allow batch processing of customer data, moving away from single-instance adjustments toward more scalable tuning.
Demonstration of New HubBoost Features: Showcase of how specific contacts are selected and processed in parallel to test improvements in real-time for a more efficient iteration process.
Future Improvements and Fine-Tuning: Future plans to implement AI fine-tuning techniques to further personalize and enhance the accuracy of generated content while considering the technological limitations and capabilities of different AI models.
I: INTRODUCTION & MOTIVATION
Adam Steidley
Hi everybody. Welcome to The Check with Joseki Tech. I'm Adam Steidley, the President of Joseki Tech.
Dan Bush
Hi, I'm Dan. I'm the VP of Application Development at Joseki Tech.
Jasmin Salas
Hey guys, I'm Jasmin. I am the Director of Marketing here at Joseki Tech.
Adam Steidley
Today, we will look at some new features Dan whipped up for us to better tune our gen AI sales and marketing email generator. So, by way of motivation before we dive in here, what we've looked at in other episodes is before and after one change. Last week, we looked at adding the customer's skills and positions into the data mix to see the result. We've looked at changes to the knowledge base and the agent's starting instructions. But each of these tuning passes has been one step at a time. We've been showing we can build the plumbing needed for this job.
Now that we've got that baseline set, we must accelerate how quickly we can tune things. Looking at one example at a time isn't enough. We wanted to find a way to do these in batches, where you can take the same group of customers, look at the result, tune something, and repeat until we're happy with the tuning. Then, we can move on to another batch of customers to see if we're happy with the result. So, Dan, you want to show us what we've got?
II: DEMO OF NEW FEATURES
Dan Bush
Okay, we have a standard Spring Boot test case here. The line I have commented out here was the first approach, where I picked three random contacts and ran the generation on those. Then I reached the point where I wanted to pick specific contacts, so what you're seeing here is three specific contacts picked.
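The switch Dan describes, from random sampling to a fixed contact list, might look like this minimal Java sketch. The `Contact` record and method names are assumptions for illustration; the actual test uses Joseki Tech's own domain classes:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Set;

// Sketch of swapping random sampling for a fixed contact list.
public class ContactSelection {

    public record Contact(String id, String email) {}

    // First approach: pick n random contacts for a quick smoke test.
    public static List<Contact> pickRandom(List<Contact> all, int n) {
        var shuffled = new ArrayList<>(all);
        Collections.shuffle(shuffled);
        return shuffled.subList(0, Math.min(n, shuffled.size()));
    }

    // Second approach: pick specific contacts so every tuning round
    // runs against the same records.
    public static List<Contact> pickSpecific(List<Contact> all, Set<String> ids) {
        return all.stream().filter(c -> ids.contains(c.id())).toList();
    }
}
```

The fixed-id variant is what makes iterative tuning meaningful: each round compares like with like.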
Adam Steidley
When doing iterations of this, it will be important to look at the same records repeatedly to see how they improve. So, random sampling is good for a quick test. But for starters, we want to have a consistent set every time.
Dan Bush
So what we'll do here is play this.
Adam Steidley
Now, we didn't put an interface around this because, you know, I've got IntelliJ on my machine, and I'm the guy who will be running it principally. As more team members start using it, I will probably put some web interface around this, but that's pretty mechanical.
Dan Bush
Yep, the teams are already working on a web app and an administrative backend, so we should be able to set you up with a UI there. You asked me about this before at some point: we are generating these in parallel, so it's not serial. One, two, three, reaching out to OpenAI. We have a standard Java ExecutorService here under the covers, so all three of those went in parallel. It usually takes 20 to 30 seconds for one time around; we're only as fast as our slowest man. Let's take a look at this and open it in a browser so you can see what it looks like.
It's rudimentary to start with. We've only picked three contacts, so I laid them out on the screen so you could see them. I also included some quick links so you can jump down to one. When you jump down, you can see the first name, the contact, the subject line generated, and the content generated for them. Then, if you're curious, you can expand the prompt accordion here to see the exact prompt that was fed into the AI.
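The parallel fan-out Dan describes can be sketched with a standard ExecutorService. Here `generateEmail` is a stand-in for the real OpenAI call, and the class and method names are assumptions, not HubBoost's actual code:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Sketch of fanning a batch of generation calls out in parallel.
public class ParallelGeneration {

    // Placeholder for the actual model call against OpenAI.
    static String generateEmail(String contactId) {
        return "email-for-" + contactId;
    }

    public static List<String> generateAll(List<String> contactIds) {
        ExecutorService pool = Executors.newFixedThreadPool(contactIds.size());
        try {
            List<Callable<String>> tasks = contactIds.stream()
                    .<Callable<String>>map(id -> () -> generateEmail(id))
                    .toList();
            // invokeAll blocks until every task finishes, so the batch
            // is only as fast as its slowest request.
            var results = new ArrayList<String>();
            for (var future : pool.invokeAll(tasks)) {
                results.add(future.get());
            }
            return results;
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
    }
}
```

`invokeAll` is what gives the "one time around" behavior from the episode: three requests go out together, and the batch completes when the slowest one returns.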
Adam Steidley
So that'll let you see what the data was around this customer passed in.
Dan Bush
Looking closely, you can see that the skills and the positions were tacked in at the end.
III: ANALYSIS & AUTOMATION
Adam Steidley
Excellent. So, we will be doing rounds of 25 to 50 at a clip and marching through all the critical elements we have in our ecosystem, starting with a closer look at our starting instructions. One area with a lot of room for improvement is our knowledge base. We only have five or six pages of text about Joseki Tech, and we need to bulk that up so the model has more information about our products and our skill set to pull from. Then there's some tuning we'll wind up doing on the tone of the message to make sure we get results we like. That will all come together to produce a better and better email with each pass.
Dan Bush
Yep, yep. We have all that on the horizon. Jasmin is working on giving us better best practices from a knowledge base perspective, and next week, we will also meet with industry experts to review our prompting approaches.
Jasmin Salas
A question: once you've generated them in batches, what's the best way to send these out? Do we copy and paste them individually into HubSpot's email section, or is there a go-to way to send out these batches? As we saw previously, an email may have all this great information, but it always needs a little tweak, especially if you make them in batches like this.
Adam Steidley
Yeah, that's a good question, Jasmin. For us, we're still planning to send them out through that same HubSpot interface, where someone gives each one a final read and tweaks it with some feedback before shipping it. This is a baseline, not so much for sending out the batch as for tuning the engine: when we're tuning, we're not doing one at a time, we're doing 25 at a time, so we can see broader patterns and more of the errors at once, and see when it does go off the rails. There is an interesting use case for sending these in batches, but for our use case that's not something we would do.
Other customers may want a fully automated solution. There's a line between a marketing email and a sales email, and you try to push it so that what is really one more generated marketing email, with no real person behind it, starts to feel more like your sales guy interacting with you. From that perspective, we'd want to run this tool through 100 examples, read all of them, and make sure none look distasteful, that we don't see hallucinations or anything that looks like a blatant error. Once we can do 100 without an error, we will be comfortable automating it to send thousands. But you still want to keep an eyeball on it.
Dan Bush
When we get to scale, it is what it is: you can't eyeball 100,000 of them. So we can have a human in the loop for a certain percentage of cases, and then we should think about doing what the industry is doing, which is using AI to check AI. If you look at how ChatGPT works, after it generates something, the output gets shoved right back into the model to scrub it for profanities and whatnot. So we could do the same thing here. We could have another assistant or AI loop that checks it.
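The second-pass check Dan describes could be sketched as a simple filter stage. The class name is an assumption, and the banned-word list is only a stub: a real implementation would call the model again with review instructions rather than match strings:

```java
import java.util.List;

// Sketch of a "use AI to check AI" loop: generated content only
// ships if a second reviewer pass approves it.
public class ReviewLoop {

    // Stub reviewer. In practice this would be another assistant or
    // moderation call, not a word list.
    static final List<String> BANNED = List.of("darn", "heck");

    static boolean passesReview(String email) {
        String lower = email.toLowerCase();
        return BANNED.stream().noneMatch(lower::contains);
    }

    // Auto-check every email; a sampled percentage would still go to
    // a human in the loop.
    public static List<String> filterApproved(List<String> generated) {
        return generated.stream().filter(ReviewLoop::passesReview).toList();
    }
}
```

The design point is the pipeline shape, generation followed by an independent check, rather than the specific reviewer used.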
Jasmin Salas
Interesting concept.
IV: FINE-TUNING & FUTURE IMPROVEMENTS
Adam Steidley
This will be a good core of our process of improving things. There is one other aspect we haven't discussed yet, which we will look at in a future pod, and that's fine-tuning. So, Dan, can you give us the basics of fine-tuning?
Dan Bush
Yeah, and we're marching towards that end here. So the deal with fine-tuning is you take a model like GPT-3.5 or GPT-4 Turbo, and you want to teach it some of your special sauce, to have that baked in. You'll generate a couple hundred samples and weigh them: you'll tell the engine, feedback-wise, whether each one was good or bad. We'll upload those and use the fine-tuning feature to generate a new model. It'll be a proprietary model at that point, a trained, tuned model, and then we'll configure our system to use that model going forward.
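The sample-weighing step Dan describes feeds into OpenAI's fine-tuning workflow, which takes a JSONL file where each line is one example conversation. A hedged sketch of preparing that file from rated samples follows; the `Sample` record and the good/bad flag are assumptions about how the ratings might be stored:

```java
import java.util.List;
import java.util.stream.Collectors;

// Sketch of turning rated generation samples into fine-tuning
// training data. Only samples rated good are kept; each becomes one
// JSONL line in OpenAI's chat fine-tuning format.
public class FineTunePrep {

    public record Sample(String prompt, String completion, boolean good) {}

    // Minimal JSON string escaping for the sketch.
    static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    static String toJsonl(Sample s) {
        return "{\"messages\":["
                + "{\"role\":\"user\",\"content\":\"" + escape(s.prompt()) + "\"},"
                + "{\"role\":\"assistant\",\"content\":\"" + escape(s.completion()) + "\"}]}";
    }

    public static String buildTrainingFile(List<Sample> samples) {
        return samples.stream()
                .filter(Sample::good)
                .map(FineTunePrep::toJsonl)
                .collect(Collectors.joining("\n"));
    }
}
```

The resulting file is what gets uploaded to create the tuned model; bad-rated samples are simply excluded in this sketch, though they could also inform prompt revisions.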
Adam Steidley
Yeah, so we will use this bulk process first to get closer to the result we want to see. Then, we'll apply fine-tuning as the final step. What will be most interesting about that episode is that fine-tuning is not available in the latest and greatest version: it's not in GPT-4 Turbo, but it is available in GPT-3.5.
Dan Bush
Yeah, there are certain models. I'm not 1000% sure right now, but the documentation indicates that fine-tuning the GPT-4 models isn't supported.
Adam Steidley
So it'll be interesting to see.
Dan Bush
Right.
Adam Steidley
It's not going anywhere. We're currently using GPT-4 Turbo for generation, and fine-tuning there is at best in beta. Another interesting thing we'll be able to look at is how our fine-tuned model on GPT-3.5 compares with untuned GPT-4, which is interesting from a pricing perspective, as well as for seeing how far along the models are.
Dan Bush
Yeah, once we start fine-tuning and baking that stuff in, it's all about the tokens, right? We get charged by the token. So once we start baking that stuff into the model, we'll no longer have to include it in the context or the knowledge base, or sneak it into the thread in some other way. It'll just be baked in.
Adam Steidley
We'll see how that does for us. So, thanks, everyone, for coming by today, and we look forward to seeing you on the next one.