This is part 2 of a discussion recorded for the Computer Weekly Developer Network (CWDN) with Andrea Mirabile in his position as global director of AI research at Zebra Technologies – part 1 is linked here.
With over 10 years as an AI scientist, Mirabile is arguably quite suitably positioned to discuss LLMs with a deep understanding of how the technology works.
Zebra Technologies has established a global AI research network to examine the on-device execution of generative AI LLMs to empower front-line workers.
CWDN: A key question – what is prompt injection? According to Luis Minvielle writing on WeAreDevelopers, “Since LLMs look like they know what they’re saying but are actually just repeating words and probabilities, they carry biases and can share prankish texts.
Mirabile: Prompt injection refers to a security vulnerability that can be exploited to manipulate the behaviour of an LLM. This vulnerability allows an attacker to introduce malicious prompts into the system, compelling the model to perform unintended actions.
In a recent example reported online, researchers demonstrated a prompt injection scenario with ChatGPT. When prompted to repeat the word “poem” indefinitely, ChatGPT unexpectedly generated what appeared to be a real email address and phone number in its responses. This incident underscored the potential risks associated with prompt injection, as it unveiled elements of the model’s training data that were not intended to be exposed. Such instances highlight the importance of addressing and mitigating prompt injection vulnerabilities to safeguard against unintended data disclosures and privacy breaches.
Various techniques and methods are employed in prompt injection attacks, each designed to influence the model’s responses in specific ways:
Basic Injection: Directly sending attacks to the target without prompt enhancements to obtain answers to unrelated questions or dictate actions. For example, pretending to be the developer, influencing the target to act as something, leveraging attack types like Carnegie Mellon Jailbreak or Typoglycemia.
Translation Injection: Exploiting the language capabilities of LLMs by injecting prompts in languages other than English to test if the model responds accordingly. For example, asking a question like “Was ist die Hauptstadt der Deutschland?” to evaluate the model’s ability to handle prompts in German. Hauptstadt (if you didn’t know) means capital city.
Maths Injection: Making LLMs perform mathematical calculations to gauge its capability for handling complex tasks. For example, crafting attack prompts related to the target context, such as asking about meditation techniques after including a mathematical calculation.
Context-Switch: Acting as if staying within the target context while posing unrelated questions to assess if sensitive information can be extracted. For example, combining questions about meditation techniques with unrelated inquiries about the exact area of Turkey to test the model’s ability to provide answers outside its designated context.
External Browsing: Testing if the LLM instance can browse to a provided URL, retrieve content and incorporate it into its responses. For example, asking the model to browse through a specified URL and provide information from an article regarding the benefits of meditation by a renowned expert.
External Prompt Injection: Testing if an LLM model can access a provided URL to retrieve additional prompts from external sources. This looks like requesting the model to explore a specified website and incorporate insights from the content found there into its responses about recommended meditation resources.
Robust security measures are needed to protect against unauthorised manipulation of language models. It is crucial for developers and organisations to be aware of these vulnerabilities and implement safeguards to secure language models against such attacks.
CWDN: Continuing this thought and thread, what is the fine-tuning phase for LLMs?
Mirabile: The fine-tuning phase for LLMs is a crucial step in adapting these general-purpose models to specific tasks or applications, enhancing their performance for real-world applications. While standard LLMs exhibit proficiency in generalised tasks, their effectiveness can be limited in domain-specific scenarios. Fine-tuning addresses this limitation by customising the model to suit particular tasks or applications through further training on domain-specific datasets.
In essence, LLM fine-tuning involves adjusting the parameters of a pre-trained model to align with the characteristics of a new domain or specific task. This process optimises the model’s performance, ensuring more relevant and accurate outputs for improved user experiences in real-world applications.
Contrary to a common misconception, fine-tuning LLMs does not necessarily require large training datasets. The extensive pre-training that these models undergo on diverse and large ‘knowledge corpora’ equips them with a foundational knowledge base. This pre-existing knowledge allows LLMs to adapt effectively to smaller, targeted datasets during fine-tuning. Therefore, the emphasis in the fine-tuning phase lies more on proper data curation and quality rather than sheer volume alone.
The benefits of fine-tuning language models are manifold. LLMs can be fine-tuned for specific downstream tasks such as text classification, sentiment analysis, machine translation and question-answering across domains like finance, healthcare and cybersecurity. Fine-tuning allows users to tailor a foundational model to a particular domain, enhancing its ability to recognise and understand domain-specific content. For instance, a fine-tuned model can be adept at medical diagnosis based on symptoms in the healthcare domain.
Fine-tuned models can be integrated into various applications to enhance user experience. For example, a retailer can fine-tune an LLM to develop a chatbot that provides personalised recommendations to visitors, aiding them in making informed purchasing decisions.
CWDN: Looking ahead, will LLMs become more task-specific and industry-specific and will we ever be in a position where LLMs are as subsumed and integrated into our enterprise software fabric as the spellchecker in our favourite Word-type app?
Mirabile: The current trajectory suggests that LLMs will likely become increasingly task-specific and industry-specific. This trend is expected to continue, potentially leading to the emergence of an industrial copilot that leverages LLMs to streamline and optimise industrial processes. By fine-tuning models for understanding and generating language specific to industrial workflows, these LLMs can serve as intelligent assistants, offering valuable insights, automating routine tasks and enhancing overall workforce efficiency.
The integration of task-specific LLMs into enterprise software and hardware holds the potential for diverse applications across various industries. In manufacturing, these models could aid in quality control and predictive maintenance. In retail, they might augment retail assistant product knowledge, help in generating compelling product descriptions, improve online customer interactions and provide personalised shopping recommendations based on individual preferences and trends.
As observed with Microsoft’s phi-2 and Google’s Gemini nano, smaller language models designed to run directly on enterprise hardware, such as the Zebra Technologies TC family of devices, are becoming more prevalent. This approach can contribute to the development of personalised assistants with improved efficiency, privacy and security, as the models operate locally on the user’s device, minimising the need for external data transmission and enhancing control over sensitive information.