GEN AI QA

 




Question 1:

Which LangChain component is responsible for generating the linguistic output in a chatbot system?

a) Document Loaders
b) Vector Stores
c) LangChain Application
d) LLMs ✅


Question 2:
Which statement best describes the role of encoder and decoder models in natural language processing?

a) Encoder models and decoder models both convert sequences of words into vector representations without generating new text.
b) Encoder models take a sequence of words and predict the next word in the sequence, whereas decoder models convert a sequence of words into a numerical representation.
c) Encoder models convert a sequence of words into a vector representation, and decoder models take this vector representation to generate a sequence of words. ✅
d) Encoder models are used only for numerical calculations, whereas decoder models are used to interpret the calculated numerical values back into text


Question 3:
What does a higher number assigned to a token signify in the “Show Likelihoods” feature of language model token generation?

a) The token is less likely to follow the current token.
b) The token is more likely to follow the current token. ✅
c) The token is unrelated to the current token and will not be used.
d) The token will be the only one considered in the next generation step.


4) Which statement is true about the “Top p” parameter of the OCI Generative AI Generation models?

a) "Top p" selects tokens from the “Top k” tokens sorted by probability
b) "Top p" assigns penalties to frequent occurring tokens
c) "Top p" limits token selection based on the sum of their probabilities. ✅
d) "Top p" determines the maximum number of tokens per response.


5) How does a presence penalty function in language model generation when using OCI Generative AI service?

a) It penalizes all tokens equally, regardless of how often they have appeared
b) It only penalizes tokens that have never appeared in the text before
c) It applies a penalty only if the token has appeared more than twice
d) It penalizes a token each time it appears after the first occurrence ✅
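
Note: as a rough sketch of the marked answer (not the OCI implementation; the function name, vocabulary size, and penalty value are invented for illustration), a presence penalty can be thought of as subtracting a fixed amount from a token's logit for each repeat occurrence after its first appearance:

import numpy as np
from collections import Counter

def apply_presence_penalty(logits, generated_ids, penalty=1.0):
    # Subtract the penalty once for every occurrence of a token after its first.
    logits = logits.copy()
    counts = Counter(generated_ids)
    for token_id, count in counts.items():
        if count > 1:
            logits[token_id] -= penalty * (count - 1)
    return logits

vocab_logits = np.zeros(10)
# Token 3 appeared three times (penalized twice); token 5 appeared once (not penalized).
print(apply_presence_penalty(vocab_logits, [3, 3, 3, 5]))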


6) Which is NOT a typical use case for LangSmith Evaluators?

a) Measuring coherence of generated text 
b) Aligning code readability ✅
c) Evaluating factual accuracy of outputs 
d) Detecting bias or toxicity 


7) What does the term “Hallucination” refer to in the context of Large Language Models (LLMs)?

a) The model’s ability to generate imaginative and creative content
b) A technique used to enhance the model’s performance on specific tasks
c) The process by which the model visualizes and describes images in detail
d) The phenomenon where the model generates factually incorrect information or unrelated content as if it were true ✅


8) Given the following code:
PromptTemplate(input_variables=["human_input", "city"], template=template)
Which statement is true about PromptTemplate in relation to input_variables?

a) PromptTemplate requires a minimum of two variables to function properly.
b) PromptTemplate can support only a single variable at a time.
c) PromptTemplate supports any number of variables, including the possibility of having none. ✅
d) PromptTemplate is unable to use any variables.
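
Note: a minimal runnable sketch of the snippet above (assumes the langchain package is installed; the travel-assistant template text is invented, and the import path can differ across LangChain versions):

from langchain.prompts import PromptTemplate

template = "You are a travel assistant. {human_input} Focus on {city}."
prompt = PromptTemplate(input_variables=["human_input", "city"], template=template)
print(prompt.format(human_input="Suggest three museums.", city="Paris"))

# A template with no variables at all is also valid, as the correct answer states.
static_prompt = PromptTemplate(input_variables=[], template="Say hello.")
print(static_prompt.format())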


9) What does the Loss metric indicate about a model’s predictions?

a) Loss measures the total number of predictions made by the model.
b) Loss is a measure that indicates how wrong the model’s predictions are. ✅
c) Loss indicates how good a prediction is, and it should increase as the model improves.
d) Loss describes the accuracy of the right predictions rather than the incorrect ones.


10) What does “k-shot prompting” refer to when using Large Language Models for task-specific applications?

a) Providing the exact k words in the prompt to guide the model’s response
b) Explicitly providing k examples of the intended task in the prompt to guide the model’s output ✅
c) The process of training the model on k different tasks simultaneously to improve its versatility
d) Limiting the model to only k possible outcomes or answers for a given task
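
Note: a simple illustration of k-shot prompting with k = 2 (the reviews and labels below are invented); the two worked examples in the prompt guide the model toward the expected output format:

few_shot_prompt = """Classify the sentiment of each review as Positive or Negative.

Review: "The battery lasts all day." Sentiment: Positive
Review: "The screen cracked within a week." Sentiment: Negative
Review: "Setup took five minutes and everything just worked." Sentiment:"""

# Sending this string to an LLM should yield "Positive" for the final review.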


11) When does a chain typically interact with memory in a run within the LangChain framework?

a) Only after the output has been generated
b) Before user input and after chain execution
c) After user input but before chain execution, and again after core logic but before output ✅
d) Continuously throughout the entire chain execution process


12) What does the RAG Sequence model do in the context of generating a response?

a) It retrieves a single document for the entire input query and generates a response based on that alone.
b) For each input query, it retrieves a set of relevant documents and considers them together to generate a cohesive response. ✅
c) It retrieves relevant documents only for the initial part of the query and ignores the rest.
d) It modifies the input query before retrieving relevant documents to ensure a diverse response.


13) What does the Ranker do in a text generation system?

a) It generates the final text based on the user’s query
b) It sources information from databases to use in text generation
c) It evaluates and prioritizes the information retrieved by the Retriever ✅
d) It interacts with the user to understand the query better


14) In which scenario is soft prompting especially appropriate compared to other training styles?

a) When there is a significant amount of labeled, task-specific data available.
b) When the model needs to be adapted to perform well in a different domain it was not originally trained on.
c) When there is a need to add learnable parameters to a Large Language Model (LLM) without task-specific training ✅
d) When the model requires continued pre-training on unlabeled data.



15) Why is it challenging to apply diffusion models to text generation?

a) Because text generation does not require complex models
b) Because text is not categorical
c) Because text representation is categorical, unlike images ✅
d) Because diffusion models can only produce images


16) Which statement describes the difference between “Top k” and “Top p” in selecting the next token in the OCI Generative AI Generation models?

a) "Top k" and "Top p" are identical in their approach to token selection but differ in their application of penalties to tokens
b) "Top k" selects the next token based on its position tokens, whereas "Top p" selects based on the cumulative probability of the top tokens ✅
c) "Top k" considers the sum of probabilities of the top tokens, whereas "Top p" selects from the "Top p" selects from the "Top k" token sorted by probability
d) "Top k" and "Top p" both select from the same set of tokens but use different methods to prioritize them based on frequency


17) What is LangChain?

a) A JavaScript library for natural language processing
b) A Python library for building applications with Large Language Models ✅
c) A Java library for text summarization
d) A Ruby library for text generation


18) Which component of Retrieval-Augmented Generation (RAG) evaluates and prioritizes the information retrieved by the retrieval system?

a) Encoder-Decoder
b) Ranker ✅
c) Generator
d) Retriever


19) Which statement is true about Fine-tuning and Parameter-Efficient Fine-Tuning (PEFT)?

a) Fine-tuning requires training the entire model on new data, often leading to substantial computational costs, whereas PEFT involves updating only a small subset of parameters, minimizing computational requirements and data needs.✅
b) PEFT requires replacing the entire model architecture with a new one designed specifically for the new task, making it significantly more data-intensive than fine-tuning
c) Both Fine-tuning and PEFT require the model to be trained from scratch on new data, making them equally data and computationally intensive.
d) Fine-tuning and PEFT do not involve model modification; they differ only in the type of data used for training, with Fine-tuning requiring labeled data and PEFT using unlabeled data


20) What do prompt templates use for templating in language model applications?

a) Python’s list comprehension syntax
b) Python’s str.format syntax ✅
c) Python’s lambda functions
d) Python’s class and object structures
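
Note: a quick illustration of the {variable} placeholder substitution the answer refers to (the template text is invented):

template = "Tell me a {adjective} fact about {topic}."
# Prompt templates resolve named placeholders the same way str.format does.
print(template.format(adjective="surprising", topic="honeybees"))
# -> Tell me a surprising fact about honeybees.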


21) What is the primary function of the “temperature” parameter in the OCI Generative AI Generation models?

a) Controls the randomness of the model’s output, affecting its creativity ✅
b) Specifies a string that tells the model to stop generating more content
c) Assigns a penalty to tokens that have already appeared in the preceding text
d) Determines the maximum number of tokens the model can generate per response


22) Which statement accurately reflects the differences between Fine-tuning, Parameter-Efficient Fine-Tuning, Soft Prompting, and Continuous Pretraining in terms of the number of parameters modified and the type of data used?

a) Fine-tuning and continuous pretraining both modify all parameters and use labeled, task-specific data.
b) Parameter Efficient Fine-Tuning and Soft Prompting modify all parameters of the model using unlabeled data.
c) Fine-tuning modifies all parameters using labeled, task-specific data, whereas Parameter Efficient Fine-Tuning updates a few, new parameters also with labeled, task-specific data. ✅
d) Soft Prompting and continuous pretraining are both methods that require no modification to the original parameters of the model.


23) Accuracy in vector databases contributes to the effectiveness of Large Language Models (LLMs) by preserving a specific type of relationship. What is the nature of these relationships, and why are they crucial for language models?

a) Linear relationships; they simplify the modelling process
b) Semantic relationships; crucial for understanding context and generating precise language ✅
c) Hierarchical relationships; important for structuring database queries
d) Temporal relationships; necessary for predicting future linguistic trends


24) What is the purpose of memory in the LangChain framework?

a) To retrieve user input and provide real-time output only
b) To store various types of data and provide algorithms for summarizing past interactions ✅
c) To perform complex calculations unrelated to user interaction
d) To act as a static database for storing permanent records


25) In the context of generating text with a Large Language Model (LLM), what does the process of greedy decoding entail?

a) Selecting a random word from the entire vocabulary at each step
b) Picking a word based on its position in a sentence structure
c) Choosing the word with the highest probability at each step of decoding ✅
d) Using a weighted random selection based on a modulated distribution
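
Note: a toy sketch of greedy decoding (the vocabulary and probabilities are invented); at every step the single highest-probability token is selected:

import numpy as np

vocab = ["the", "cat", "sat", "mat", "dog"]
next_token_probs = np.array([0.05, 0.10, 0.60, 0.15, 0.10])

# Greedy decoding: always take the argmax of the next-token distribution.
print(vocab[int(np.argmax(next_token_probs))])  # -> "sat"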


26) What does in-context learning in Large Language Models involve?

a) Pretraining the model on a specific domain
b) Training the model using reinforcement learning
c) Conditioning the model with task-specific instructions or demonstrations ✅
d) Adding more layers to the model


27) What role does a “model endpoint” serve in the inference workflow of the OCI Generative AI service?

a) Updates the weights of the base model during the fine-tuning process
b) Serves as a designated point for user requests and model responses ✅
c) Evaluates the performance metrics of the custom models
d) Hosts the training data for fine-tuning custom models


28) Which is a characteristic of T-Few fine-tuning for Large Language Models (LLMs)?

a) It updates all the weights of the model uniformly.
b) It does not update any weights but restructures the model architecture
c) It selectively updates only a fraction of the model’s weights. ✅
d) It increases the training time as compared to Vanilla fine-tuning


29) What is the main advantage of using few-shot model prompting to customize a Large Language Model (LLM)?

a) It allows the LLM to access a larger dataset.
b) It eliminates the need for any training or computational resources.
c) It provides examples in the prompt to guide the LLM to better performance with no training cost.✅
d) It significantly reduces the latency for each model request


30) What is the purpose of Retrieval Augmented Generation (RAG) in text generation?

a) To generate text based only on the model’s internal knowledge without external data
b) To generate text using extra information obtained from an external data source ✅
c) To store text in an external database without using it for generation
d) To retrieve text from an external source and present it without any modifications


31) What is the purpose of the “stop sequence” parameter in the OCI Generative AI Generation models?

a) It specifies a string that tells the model to stop generating more content. ✅
b) It assigns a penalty to frequently occurring tokens to reduce repetitive text.
c) It determines the maximum number of tokens the model can generate per response.
d) It controls the randomness of the model’s output, affecting creativity.


32) Which is a key characteristic of the annotation process used in T-Few fine-tuning?

a) T-Few fine-tuning uses annotated data to adjust a fraction of model weights ✅
b) T-Few fine-tuning requires manual annotation of input-output pairs.
c) T-Few fine-tuning involves updating the weights of all layers in the model
d) T-Few fine-tuning relies on unsupervised learning techniques for annotation


33) Which is the main characteristic of greedy decoding in the context of language model word prediction?

a) It chooses words randomly from the set of less probable candidates
b) It requires a large temperature setting to ensure diverse word selection.
c) It selects words based on a flattened distribution over the vocabulary
d) It picks the most likely word at each step of decoding ✅


34) Which is a key advantage of using T-Few over Vanilla fine-tuning in the OCI Generative AI service?

a) Reduced model complexity
b) Enhanced generalization to unseen data
c) Increased model interpretability
d) Faster training time and lower cost ✅


35) When is fine-tuning an appropriate method for customizing a Large Language Model (LLM)?

a) When the LLM already understands the topics necessary for text generation
b) When the LLM does not perform well on a task and the data for prompt engineering is too large ✅
c) When the LLM requires access to the latest data for generating outputs
d) When you want to optimize the model without any instructions


36) Which component of Retrieval-Augmented Generation (RAG) evaluates and prioritizes the information retrieved by the retrieval system?

a) Retriever
b) Encoder-Decoder
c) Generator
d) Ranker ✅


37) Which is NOT a built-in memory type in LangChain?

a) ConversationImageMemory ✅
b) ConversationBufferMemory
c) ConversationSummaryMemory
d) ConversationTokenBufferMemory
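
Note: a minimal example of one of the built-in memory types listed above, ConversationBufferMemory (assumes langchain is installed; the exact import path can differ across versions, and the messages are invented):

from langchain.memory import ConversationBufferMemory

memory = ConversationBufferMemory()
# Store one exchange, then read the accumulated history back.
memory.save_context({"input": "Hi, I'm planning a trip."},
                    {"output": "Great! Which city are you visiting?"})
print(memory.load_memory_variables({}))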


38) How do Dot Product and Cosine Distance differ in their application to comparing text embeddings in natural language processing?

a) Dot Product assesses the overall similarity in content, whereas Cosine Distance measures topical relevance
b) Dot Product is used for semantic analysis, whereas Cosine Distance is used for syntactic comparisons
c) Dot Product measures the magnitude and direction of vectors, whereas Cosine Distance focuses on the orientation regardless of magnitude. ✅
d) Dot Product calculates the literal overlap of words, whereas Cosine Distance evaluates the stylistic similarity
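
Note: a small NumPy illustration of the difference (the vectors are invented); the dot product grows with vector magnitude, while cosine distance depends only on orientation:

import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = np.array([2.0, 4.0, 6.0])  # same direction as a, twice the magnitude

dot = np.dot(a, b)
cosine_distance = 1 - dot / (np.linalg.norm(a) * np.linalg.norm(b))

print(dot)              # 28.0 - sensitive to magnitude
print(cosine_distance)  # 0.0  - identical direction, so distance is zero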


39) Given the following code block:

history = StreamlitChatMessageHistory(key="chat_messages")
memory = ConversationBufferMemory(chat_memory=history)

Which statement is NOT true about StreamlitChatMessageHistory?

a) StreamlitChatMessageHistory will store messages in Streamlit session state at the specified key.
b) A given StreamlitChatMessageHistory will NOT be persisted.
c) A given StreamlitChatMessageHistory will not be shared across user sessions.
d) StreamlitChatMessageHistory can be used in any LLM application. ✅


40) How does the temperature setting in a decoding algorithm influence the probability distribution over the vocabulary?

a) Increasing the temperature removes the impact of the most likely word.
b) Decreasing the temperature broadens the distribution, making less likely words more probable.
c) Increasing the temperature flattens the distribution, allowing for more varied word choices. ✅
d) Temperature has no effect on probability distribution; it only changes the speed of decoding.
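
Note: a short sketch of temperature scaling applied to logits before the softmax (the logit values are invented); lower temperatures sharpen the distribution, higher temperatures flatten it:

import numpy as np

def softmax_with_temperature(logits, temperature):
    scaled = logits / temperature
    exp = np.exp(scaled - scaled.max())  # subtract max for numerical stability
    return exp / exp.sum()

logits = np.array([4.0, 2.0, 1.0])
print(softmax_with_temperature(logits, temperature=0.5))  # sharper: top token dominates
print(softmax_with_temperature(logits, temperature=2.0))  # flatter: more varied choices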


