Hello prompt engineers,
Previous blogs explained how to add droidcon session favorites to a database and also cache the embedding vectors in a database – but what if we stored everything in a database and then let the model query it directly?
The OpenAI Cookbook examples repo includes a section on how to call functions with model generated arguments, which includes a python demo of a function that understands a database schema and generates SQL that is executed to answer questions from the chat. There’s also a natural language to SQL demo that demonstrates the model’s understanding of SQL.
In this post, we’ll implement the model generated arguments function call in Kotlin, using the droidcon SF 2023 conference schedule data as the source.
Figure 1: JetchatAI droidcon SF demo answering questions with dynamically generated and executed SQL
SQLite on Android
Just like the previous posts on storing favorites and vectors, we’ll use the Android Sqlite APIs described in the the SQLite training documentation. Unlike those functions, however, we’ll be using the rawQuery()
function rather than building up “strongly typed” queries because the SQL will be generated by the model.
As the example explains, this can be dangerous in production applications, as the model is not perfectly reliable at generating valid SQL, so there’s a possibility that it’ll generate code that won’t run (or an even more remote possibility that it could generate destructive queries). It could also expose data you don’t want the user to see (from system tables, internal tables, or data they shouldn’t have access to).
In our use case, we’ll only be executing against a read-only database API, so there’s no risk of data loss, and there’s only a single-user’s favorites stored so no privacy concerns either.
Creating the database
See the data/DroidconDatabase.kt file for the complete database schema and data loading functionality. Since this is a sample focused on AI and not SQL, you’ll notice the schema and code is somewhat simplistic. In a production application, you would spend more time on a robust data strategy.
Creating an SQL query function
We’ll start by creating the call function class, using the template from the save favorites blog post. You can view the result in the functions/AskDatabaseFunction.kt file, and the descriptive properties are shown in Figure 2:
class AskDatabaseFunction { companion object { fun name(): String { return "askDatabase" } fun description(): String { return "Answer user questions about conference sessions like what room sessions are presented in. Output should be a fully formed SQL query." } // ...
Figure 2: Starting point for the AskDatabaseFunction
Now we need to figure out what the parameters are, and then how to handle the function when called.
Teaching the model about a schema
Exposing the table schema to the chat model via a function can enable it to generate its own queries. The ideal way to achieve this is by inferring the actual database schema and feeding it to the model as a function parameter. Because we have a simple data structure in the droidcon demo, it was quick and easy to build a ‘proof of concept’ by hardcoding the schema.
Figure 3 shows the hardcoded schema description along with prompt prefix and suffix that guides the model in how to interpret it.
fun params(): Parameters { val params = Parameters.buildJsonObject { put("type", "object") putJsonObject("properties") { putJsonObject("query") { put("type", "string") put(""" SQL query extracting info to answer the user's question. SQL should be written using this database schema: Table: sessions Columns: session_id, speaker, role, location_id, date, time, subject, description Table: favorites Columns: session_id, is_favorite Table: locations Columns: location_id, description The query should be returned in plain text, not in JSON.""".trimIndent() ) } } putJsonArray("required") { add("query") } } return params }
Figure 3: Create a hardcoded schema to test the function and the SQL generated by the model
The class is complete enough to do some debugging – add the following code to the chatCompletionRequest
and you can use logcat or debugging to view the SQL that the model generates in response to your questions.
function { name = AskDatabaseFunction.name() description = AskDatabaseFunction.description() parameters = AskDatabaseFunction.params() }
Figure 4: Adding the hardcoded function params allows us to test the model’s responses
If you plan to manage your database schema carefully, you could probably get away with shipping a hardcoded schema like this. It would also make it easy to omit some tables or columns, preventing the model from exposing some data.
Teaching the model with the latest schema
To fully implement the OpenAI cookbook example we need to dynamically generate the schema using system tables. This will make the code less fragile and prevent a hardcoded schema from getting ‘out of sync’ with any changes made in DroidconDatabase.kt. By dynamically generating the tables and columns each time, the model will always be getting the latest schema that reflects exactly what’s present in Sqlite.
To generate the schema, we need to query sqlite_master
and then the table_info
to get the table and column names, then format them into a structure like the hardcoded version shown above. Figure 3 shows the function that uses Sqlite system tables and outputs the schema shape as text:
fun generateSimpleSchema(): String { val db = readableDatabase var out = "" val tableCursor = db.rawQuery("SELECT name FROM sqlite_master WHERE type='table'", null) with(tableCursor) { while (moveToNext()) { val tableName = getString(0) // table name out += "Table: $tableName\nColumns: " val columnCursor = db.rawQuery("PRAGMA table_info('$tableName');", null) var needComma = false with(columnCursor) { while (moveToNext()) { val columnName = getString(1) // column name if (needComma) out += ", " else needComma = true out += "$columnName" } } columnCursor.close() out += "\n\n" } } tableCursor.close() return out }
Figure 5: Use the system tables to generate the database schema
To use the dynamically generated schema, call db.generateSimpleSchema()
in the params
method and use the result in the query.
Now that the model has access to the latest database schema, all that remains is to execute the queries that it generates.
Dynamic SQL results
As noted above, we’re showing a lot of trust in the model by executing its SQL queries directly in our database. The code is shown in Figure 6 – notice the first line creates a read-only access to the database and then we execute the raw query directly. The rest of the function is just string formatting to return the columns in a specific format – in this case mimicking the OpenAI demo which uses a layout like this: [('abc', 1),('def', 2)]
:
fun function(dbHelper: DroidconDbHelper, query: String): String { val db = dbHelper.readableDatabase val cursor = db.rawQuery(query,null) // execute the query! var rowCount = 0 var out = "" var needOuterComma = false with(cursor) { out += "[" var needComma = false while (moveToNext()) { if (needOuterComma) out += "," else needOuterComma = true out += "(" for (i in 0 until cursor.columnCount) { if (needComma) out += "," else needComma = true out += when (getType(i)) { FIELD_TYPE_INTEGER -> getString(i) else -> { "'${getString(i)}'" } } } out += ")" rowCount++ } out += "]" } cursor.close() return if (out == "") { "0 rows affected" } else { out } }
Figure 6: Chat function class that can execute SQL directly on the local database
The model seems to have no trouble understanding the output format and matching the tuples with the query that it created, so it can generate responses that correctly interpret complex return datasets.
How the model uses SQL querying capabilities
The droidcon SF chat demo already has a couple of mechanisms for interacting with conference data:
- Conference metadata is hardcoded in the system prompt
- Embeddings are used to for retrieval augmented generation
- There’s a data-driven function to return sessions by time
- The favorites feature has its own insert/select/delete functionality over the data
When we add the SqlQueryFunction
to the mix, the model might not always choose the most effective way to answer user queries. When all these options are available, the AskDatabaseFunction
gets called mostly for queries like “how many sessions is craig presenting?” or “how many sessions are in Robertson 1?”, e.g.:
Figure 7a: the types of questions that could get routed to the SQL querying function
SELECT COUNT(*) FROM sessions WHERE speaker = 'CRAIG DUNN'
Figure 7b: example query from logcat
For a query like “how many sessions are in Robertson 1?”, even though the database query would produce a better result, the model calls the SessionsByTimeFunction
which gives a partial answer (only the sessions in Robertson 1 within the default time parameters, so in this case it answers “2 sessions”).
Other questions that are possible include “what rooms are there?” and “how do I get to fisher east”:
Figure 8a: more questions that can be answered by the AskDatabaseFunction
Which result in the following queries and responses:
SELECT DISTINCT location_id FROM locations [('Fisher East'),(,'Fisher West'),(,'Robertson 1'),(,'Robertson 2')] SELECT directions FROM locations WHERE location_id = 'Fisher East' [('Outside off the courtyard')]
Figure 8b: the generated SQL and query result for the examples in Figure 9a
With the ability to generate and execute SQL the model can answer many different types of questions beyond the embeddings-based version that was originally presented at droidcon SF 2023.
Resources and feedback
The Jetchat sample with the droidcon schedule is available at github.com/conceptdev/droidcon-sf-23.
We’d love your feedback on this post – if you have any thoughts or questions, use the feedback forum or message us on Twitter @surfaceduodev.
There will be no livestream this week, but you can check out the archives on YouTube.
0 comments