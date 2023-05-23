Large language models (LLMs) have the potential to transform how data analysis and decision-making are approached. It is therefore crucial to understand the short-term and mid-term implications of this development.

ChatGPT, an application built on a large language model, is changing how practitioners think about data and analytics. Its capabilities in natural language processing and machine learning (ML) have already made it a valuable tool in industries ranging from finance and healthcare to education and entertainment.

ChatGPT’s evolution is far from over. In the coming year, further rapid progress is expected in the development of ChatGPT and similar solutions, along with the emergence of complementary technologies.

Limitations and risks

As an application built on the GPT large language model, ChatGPT is well suited to language-related tasks. Its ability to perform mathematical operations is more limited, however, and it is not suitable for many analytics tasks.

Many use cases for ChatGPT in data and analytics will involve helping data engineers, data analysts and data scientists with programming and code generation. For example, a data engineer could ask a language model to generate data ingestion and transformation scripts, configuration templates, and SQL queries. A data analyst could use it to help generate DAX code for Power BI, while a data scientist could use it to help review Python code for machine-learning functions.

Such usage, however, risks exposing restricted or sensitive information to the company that builds the model. Data and analytics practitioners should therefore ensure that the LLM service provider handles any confidential or proprietary material properly. For ChatGPT, that provider would be OpenAI or Microsoft, through its Azure OpenAI Service.

Another risk is that generated code or other output may be unreliable, so the recommended approach is to use these tools to augment the developer’s processes. Data and analytics practitioners should take the lead in informing the risk and compliance policies related to generative AI tools, and act as subject matter experts when educating business stakeholders. They should introduce AI-generated code in phases, flag it as such, and monitor it against standard code quality controls, or subject it to the same review and testing as human-written code.
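One way to hold generated code to the same testing as human-written code is to run it against a small fixture with known answers before it goes anywhere near production data. The sketch below is illustrative only: the `sales` table, its rows, and the query text in `GENERATED_SQL` are hypothetical stand-ins for whatever a language model actually produced, and SQLite is used simply because it ships with Python.

```python
import sqlite3

# Hypothetical query text, standing in for output pasted from an LLM.
GENERATED_SQL = """
SELECT region, SUM(amount) AS total
FROM sales
GROUP BY region
ORDER BY region
"""


def run_against_fixture(sql: str) -> list:
    """Run a query against a tiny in-memory table with known contents."""
    conn = sqlite3.connect(":memory:")
    try:
        # Illustrative schema and rows; not taken from any real system.
        conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
        conn.executemany(
            "INSERT INTO sales VALUES (?, ?)",
            [("north", 10.0), ("north", 5.0), ("south", 7.5)],
        )
        return conn.execute(sql).fetchall()
    finally:
        conn.close()


def test_generated_query() -> None:
    # Expected totals are worked out by hand from the fixture rows above.
    expected = [("north", 15.0), ("south", 7.5)]
    assert run_against_fixture(GENERATED_SQL) == expected


if __name__ == "__main__":
    test_generated_query()
    print("generated query passed the fixture test")
```

Because the fixture holds only synthetic rows, a check like this can run in an ordinary test suite without sending or exposing any confidential data, and a query that silently aggregates the wrong column or drops a group fails the assertion instead of reaching a report.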