Surface Duo Blog

Build great Android experiences, from AI to foldables and large screens.

Latest posts

OpenAI tokens and limits
Aug 24, 2023

Craig Dunn

Hello prompt engineers, The Jetchat demo that we’ve been covering in this blog series uses the OpenAI Chat API, and in each blog post where we add new features, it supports conversations with a reasonable number of replies. However, just like any LLM request API, there are limits to the number of tokens that can be processed, and the APIs are stateless, meaning that all the context needed for a given request must be included in the prompt. This means that each chat request and response gets added to the conversation history, and the whole history is sent to the API after each new input so that the co...
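Because the API is stateless and token-limited, the client has to decide which part of the conversation history still fits in the prompt. A minimal sketch of that idea, assuming a rough characters-per-token heuristic (the names `ChatMessage`, `estimateTokens`, and `trimHistory` are illustrative, not the post's actual code):

```kotlin
// Hypothetical sketch: trim conversation history to a token budget,
// always keeping the system prompt and the most recent messages.
data class ChatMessage(val role: String, val content: String)

// Very rough heuristic: ~4 characters per token for English text.
fun estimateTokens(text: String): Int = (text.length + 3) / 4

fun trimHistory(messages: List<ChatMessage>, maxTokens: Int): List<ChatMessage> {
    val system = messages.filter { it.role == "system" }
    var budget = maxTokens - system.sumOf { estimateTokens(it.content) }
    val kept = mutableListOf<ChatMessage>()
    // Walk backwards from the newest message, keeping what fits.
    for (msg in messages.filter { it.role != "system" }.asReversed()) {
        val cost = estimateTokens(msg.content)
        if (cost > budget) break
        kept.add(msg)
        budget -= cost
    }
    return system + kept.asReversed()
}
```

A real implementation would use the model's tokenizer rather than a character count, but the shape of the trimming loop is the same.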

Prompt engineering tips
Aug 17, 2023

Craig Dunn

Hello prompt engineers, We’ve been sharing a lot of OpenAI content over the last few months, and because each blog post typically focuses on a specific feature or API, there are often smaller learnings or discoveries that don’t get mentioned or highlighted. In this blog we’re sharing a few little tweaks that we discovered when creating LLM prompts for the samples we’ve shared. Set the system prompt The droidcon SF sessions demo has a few different instructions in its system prompt, each for a specific purpose (explained below): Keep the chat focused The first part of the sy...

Dynamic Sqlite queries with OpenAI chat functions
Aug 10, 2023

Craig Dunn

Hello prompt engineers, Previous blogs explained how to add droidcon session favorites to a database and also cache the embedding vectors in a database – but what if we stored everything in a database and then let the model query it directly? The OpenAI Cookbook examples repo includes a section on how to call functions with model generated arguments, which includes a Python demo of a function that understands a database schema and generates SQL that is executed to answer questions from the chat. There’s also a natural language to SQL demo that demonstrates the model’s understanding of SQL. ...
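Executing model-generated SQL means the app should guard against mutating statements before running them. A minimal sketch of that kind of guard, assuming a simple keyword denylist (`isReadOnlyQuery` is a hypothetical helper, not from the post; a production check would be more precise about word boundaries):

```kotlin
// Hypothetical guard: only allow read-only SELECT statements generated
// by the model to reach the SQLite database. Deliberately conservative:
// any forbidden keyword anywhere in the statement rejects it.
val forbiddenKeywords = listOf("insert", "update", "delete", "drop", "alter", "pragma", "attach")

fun isReadOnlyQuery(sql: String): Boolean {
    val normalized = sql.trim().lowercase()
    if (!normalized.startsWith("select")) return false
    return forbiddenKeywords.none { normalized.contains(it) }
}
```

The chat function handler would call this before handing the string to the database, and return an error message to the model otherwise.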

Embedding vector caching (redux)
Aug 3, 2023

Craig Dunn

Hello prompt engineers, Earlier this year I tried to create a hardcoded cache of embedding vectors, only to be thwarted by the limitations of Kotlin (the combined size of the arrays of numbers exceeded Kotlin’s maximum function size). Now that we’ve added Sqlite to the solution to support memory and querying, we can use that infrastructure to also cache the embedding vectors. Note that the version of Sqlite we’ll use on Android does not have any special “vector database” features – instead, the embedding vectors will just be serialized/deserialized and stored in a column. Embedding vector simil...
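Since this version of SQLite has no vector features, storing an embedding just means packing the numbers into a BLOB column and unpacking them on read. One way to sketch that serialization round-trip (illustrative helper names, not the post's actual code):

```kotlin
import java.nio.ByteBuffer

// Sketch: pack an embedding vector into a BLOB-friendly byte array,
// and read it back. SQLite stores the result as an opaque BLOB column.
fun serializeEmbedding(vector: DoubleArray): ByteArray {
    val buffer = ByteBuffer.allocate(vector.size * Double.SIZE_BYTES)
    vector.forEach { buffer.putDouble(it) }
    return buffer.array()
}

fun deserializeEmbedding(bytes: ByteArray): DoubleArray {
    val buffer = ByteBuffer.wrap(bytes)
    return DoubleArray(bytes.size / Double.SIZE_BYTES) { buffer.getDouble() }
}
```

Similarity comparisons then happen in application code after deserializing, since the database itself can't compare vectors.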

Chat memory with OpenAI functions
Jul 27, 2023

Craig Dunn

Hello prompt engineers, We first introduced OpenAI chat functions with a weather service and then a time-based conference sessions query. Both of those examples work well for ‘point in time’ queries or questions about a static set of data (e.g., the conference schedule). But each time the JetchatAI app is opened, it has no recollection of previous chats. In this post, we’re going to walk through adding some more function calls to support “favoriting” (and “unfavoriting”) conference sessions so they can be queried later. Figure 1: saving and retrieving a favorited session This will ...
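The "memory" behind favoriting boils down to a small persistent store whose operations map one-to-one onto chat functions the model can call. A minimal in-memory sketch of that shape (hypothetical class and method names; the real app persists to SQLite):

```kotlin
// Hypothetical sketch: each method corresponds to a chat function
// ("addFavorite", "removeFavorite", "listFavorites") exposed to the model.
class FavoritesStore {
    private val favorites = mutableSetOf<String>()

    fun addFavorite(sessionId: String): Boolean = favorites.add(sessionId)
    fun removeFavorite(sessionId: String): Boolean = favorites.remove(sessionId)
    fun listFavorites(): List<String> = favorites.sorted()
}
```

When the model emits a function call like `addFavorite("session-id")`, the app invokes the matching method and feeds the result back into the chat.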

Combining OpenAI function calls with embeddings
Jul 20, 2023

Craig Dunn

Hello prompt engineers, Last week’s post introduced OpenAI chat function calling to implement a live weather response. This week, we’ll look at how to use function calling to enhance responses when retrieving data with embeddings isn’t appropriate. The starting point will be the droidcon SF sample we’ve covered previously: Figure 1: droidcon chat and the questions it can answer using the system prompt or embedding similarity As you can see, the droidcon chat implementation can answer questions like “when is droidcon SF?” (using grounding in the system prompt) and “are there any AI s...
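Embedding similarity is the gate here: when the best match against the user's query scores below some threshold, the app falls back to a function call instead of answering from embedded documents. The core comparison is cosine similarity, which can be sketched as (illustrative code, not the sample's exact implementation):

```kotlin
import kotlin.math.sqrt

// Sketch: cosine similarity between two embedding vectors. If the best
// score across all documents falls below a threshold, the app would
// route the query to a chat function instead.
fun cosineSimilarity(a: DoubleArray, b: DoubleArray): Double {
    require(a.size == b.size) { "vectors must have equal dimensions" }
    var dot = 0.0
    var normA = 0.0
    var normB = 0.0
    for (i in a.indices) {
        dot += a[i] * b[i]
        normA += a[i] * a[i]
        normB += b[i] * b[i]
    }
    return dot / (sqrt(normA) * sqrt(normB))
}
```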

OpenAI chat functions on Android
Jul 13, 2023

Craig Dunn

Hello prompt engineers, OpenAI recently announced a new feature – function calling – that makes it easier to extend the chat API with external data and functionality. This post will walk through the code to implement a “chat function” in the JetchatAI sample app (discussed in earlier posts). Following the function calling documentation and the example provided by the OpenAI kotlin client-library, a real-time “weather” data source will be added to the chat. Figures 1 and 2 below show the chat response before and after implementing the function: Figure 1: without a function to provide r...
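On the client side, handling a function call comes down to routing the function name the model returns to a local handler and sending the result back. A stripped-down sketch of that dispatch step (the `fetchWeather` stub and handler names are hypothetical stand-ins for a real data source):

```kotlin
// Hypothetical dispatcher: route the model's function_call by name to a
// local handler, then return the result to be appended to the chat.
// fetchWeather is a stand-in for a real weather API call.
fun fetchWeather(location: String): String = "72F and sunny in $location"

val functionHandlers: Map<String, (String) -> String> = mapOf(
    "currentWeather" to ::fetchWeather
)

fun dispatchFunctionCall(name: String, argument: String): String =
    functionHandlers[name]?.invoke(argument)
        ?: error("Unknown function: $name")
```

In the real flow the argument arrives as a JSON string matching the function's declared parameter schema, which the app parses before invoking the handler.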

Multimodal Augmented Inputs in LLMs using Azure Cognitive Services
Jul 6, 2023

Parker Schroeder

Hello AI enthusiasts, This week, we’ll be talking about how you can use Azure Cognitive Services to enhance the types of inputs your Android AI scenarios can support. What makes an LLM multimodal? Popular LLMs like ChatGPT are trained on vast amounts of text from the internet. They accept text as input and provide text as output. Extending that logic a bit further, multimodal models like GPT-4 are trained on various datasets containing different types of data, like text and images. As a result, the model can accept multiple data types as input. In a paper titled Language Is Not Al...

Embedding vector caching
Jun 29, 2023

Craig Dunn

Hello prompt engineers, A few weeks ago I added a custom datastore (the droidcon SF schedule) to the Jetchat OpenAI chat sample. One of the ‘hacks’ I used was generating the embeddings used for similarity comparisons on every startup and caching them in memory: This results in ~70 web requests each time, plus the (albeit low) monetary cost of the OpenAI embeddings endpoint. It is a fast and easy way to build a demo, but in a production application you would want to avoid both the startup delay and the cost! In this post I’ll discuss my first attempt at building a vector cache on-device, and the...
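The fix for repeated startup requests is memoization: look up an embedding locally first and only hit the endpoint on a cache miss. A minimal sketch of that pattern, with the network call abstracted behind a lambda (`EmbeddingCache` is an illustrative name; the fetch function stands in for the OpenAI embeddings endpoint):

```kotlin
// Sketch: memoize embeddings so each distinct string triggers at most
// one fetch. `fetch` stands in for the OpenAI embeddings endpoint call;
// a persistent version would back the map with a database table.
class EmbeddingCache(private val fetch: (String) -> DoubleArray) {
    private val cache = mutableMapOf<String, DoubleArray>()
    var fetchCount = 0
        private set

    fun embeddingFor(text: String): DoubleArray =
        cache.getOrPut(text) {
            fetchCount++
            fetch(text)
        }
}
```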