This project addresses the challenge of translating geospatial natural language queries into SQL by fine-tuning an LLM for text-to-SQL over geospatial relational databases. A key component is a Brazilian Portuguese dataset of geospatial queries, tailored to the domain of the Cultura Educa project. The dataset will include diverse examples that use spatial functions such as `ST_Contains`, `ST_Buffer`, and `ST_Within`, reflecting real-world applications. Training on these examples is intended to improve the model's handling of the syntax and semantics of geospatial SQL, particularly in Portuguese, where resources are even more limited.
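To make the dataset idea concrete, here is a minimal sketch of what question/SQL pairs might look like and how they could be turned into fine-tuning records. Everything beyond the three function names is an assumption for illustration: the table and column names (`escolas`, `municipios`, `setores_censitarios`, `equipamentos_culturais`, `bibliotecas`, `geom`), the prompt layout, and the helpers `spatial_functions` and `to_training_pair` are hypothetical and do not describe the actual Cultura Educa schema or the project's training format. The `ST_Buffer` example also assumes geometries stored in a projected coordinate system whose unit is meters.

```python
import re

# Hypothetical question/SQL pairs. Table and column names are invented for
# illustration and are NOT the real Cultura Educa schema.
EXAMPLES = [
    {
        # "Which schools are located within the municipality of São Paulo?"
        "question": "Quais escolas ficam dentro do município de São Paulo?",
        "sql": (
            "SELECT e.nome FROM escolas e "
            "JOIN municipios m ON ST_Within(e.geom, m.geom) "
            "WHERE m.nome = 'São Paulo';"
        ),
    },
    {
        # "How many cultural facilities are there in each census tract?"
        "question": "Quantos equipamentos culturais existem em cada setor censitário?",
        "sql": (
            "SELECT s.id, COUNT(c.id) FROM setores_censitarios s "
            "LEFT JOIN equipamentos_culturais c ON ST_Contains(s.geom, c.geom) "
            "GROUP BY s.id;"
        ),
    },
    {
        # "Which libraries are within 500 meters of some school?"
        # Assumes a projected CRS in meters, so the buffer radius is 500 m.
        "question": "Quais bibliotecas estão a até 500 metros de alguma escola?",
        "sql": (
            "SELECT DISTINCT b.nome FROM bibliotecas b "
            "JOIN escolas e ON ST_Within(b.geom, ST_Buffer(e.geom, 500));"
        ),
    },
]

# Matches identifiers such as ST_Within that are followed by an opening parenthesis.
SPATIAL_FN = re.compile(r"\bST_[A-Za-z0-9]+(?=\s*\()")


def spatial_functions(sql: str) -> list[str]:
    """Return the sorted, de-duplicated spatial function names used in a query."""
    return sorted(set(SPATIAL_FN.findall(sql)))


def to_training_pair(example: dict, schema_hint: str) -> dict:
    """Build a prompt/completion record (hypothetical format) from one example."""
    prompt = f"-- Schema: {schema_hint}\n-- Pergunta: {example['question']}\n"
    return {"prompt": prompt, "completion": example["sql"]}


if __name__ == "__main__":
    for ex in EXAMPLES:
        print(spatial_functions(ex["sql"]), "<-", ex["question"])
```

A helper like `spatial_functions` could be used to check that the dataset covers each spatial function with enough examples, rather than over-representing a single one.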
If you'd like to learn more about this project, feel free to reach out via LinkedIn.