In this tutorial, we walk through building an advanced PaperQA2 AI agent powered by Google's Gemini model, designed specifically for scientific literature analysis. We set up the environment in a Google Colab notebook, configure the Gemini API, and integrate it seamlessly with PaperQA2 to process and query multiple research papers. By the end of the setup, we have an intelligent agent capable of answering complex questions, performing multi-question analyses, and conducting comparative analysis across papers, all while providing clear answers with evidence from source documents. Check out the full codes here.
!pip install "paper-qa>=5" google-generativeai requests pypdf2 -q
import os
import asyncio
import tempfile
import requests
from pathlib import Path
from paperqa import Settings, ask, agent_query
from paperqa.settings import AgentSettings
import google.generativeai as genai
GEMINI_API_KEY = "Use Your API Key Here"
os.environ["GEMINI_API_KEY"] = GEMINI_API_KEY
genai.configure(api_key=GEMINI_API_KEY)
print("✅ Gemini API key configured successfully!")
We begin by installing the required libraries, including PaperQA2 and Google's Generative AI SDK, and then import the necessary modules for our project. We set our Gemini API key as an environment variable and configure it, making sure the integration is ready for use.
def download_sample_papers():
    """Download sample AI/ML research papers for demonstration"""
    papers = {
        "attention_is_all_you_need.pdf": "https://arxiv.org/pdf/1706.03762.pdf",
        "bert_paper.pdf": "https://arxiv.org/pdf/1810.04805.pdf",
        "gpt3_paper.pdf": "https://arxiv.org/pdf/2005.14165.pdf"
    }
    papers_dir = Path("sample_papers")
    papers_dir.mkdir(exist_ok=True)
    print("📥 Downloading sample research papers...")
    for filename, url in papers.items():
        filepath = papers_dir / filename
        if not filepath.exists():
            try:
                response = requests.get(url, stream=True, timeout=30)
                response.raise_for_status()
                with open(filepath, 'wb') as f:
                    for chunk in response.iter_content(chunk_size=8192):
                        f.write(chunk)
                print(f"✅ Downloaded: {filename}")
            except Exception as e:
                print(f"❌ Failed to download {filename}: {e}")
        else:
            print(f"📄 Already exists: {filename}")
    return str(papers_dir)

papers_directory = download_sample_papers()
def create_gemini_settings(paper_dir: str, temperature: float = 0.1):
    """Create optimized settings for PaperQA2 with Gemini models"""
    return Settings(
        llm="gemini/gemini-1.5-flash",
        summary_llm="gemini/gemini-1.5-flash",
        agent=AgentSettings(
            agent_llm="gemini/gemini-1.5-flash",
            search_count=6,
            timeout=300.0,
        ),
        embedding="gemini/text-embedding-004",
        temperature=temperature,
        paper_directory=paper_dir,
        answer=dict(
            evidence_k=8,
            answer_max_sources=4,
            evidence_summary_length="about 80 words",
            answer_length="about 150 words, but can be longer",
            max_concurrent_requests=2,
        ),
        parsing=dict(
            chunk_size=4000,
            overlap=200,
        ),
        verbosity=1,
    )
We download a set of well-known AI/ML research papers for our analysis and store them in a dedicated folder. We then create optimized PaperQA2 settings configured to use Gemini for all LLM and embedding tasks, fine-tuning parameters like search count, evidence retrieval, and parsing for efficient and accurate literature processing.
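The parsing settings above split each paper into overlapping chunks before evidence retrieval. As a rough illustration of what `chunk_size=4000` and `overlap=200` mean, here is a simplified sketch of overlapping chunking (not PaperQA2's actual parser, just the general idea):

```python
def chunk_text(text: str, chunk_size: int = 4000, overlap: int = 200) -> list:
    """Split text into chunks of at most chunk_size characters,
    where consecutive chunks share `overlap` characters of context."""
    if chunk_size <= overlap:
        raise ValueError("chunk_size must exceed overlap")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap
    return chunks

# A 10,000-character document yields three chunks; the last is shorter.
doc = "x" * 10_000
print([len(c) for c in chunk_text(doc)])  # → [4000, 4000, 2400]
```

The overlap keeps a sentence that straddles a chunk boundary fully visible in at least one chunk, which is why retrieval quality usually improves with a modest overlap.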
class PaperQAAgent:
    """Advanced AI Agent for scientific literature analysis using PaperQA2"""

    def __init__(self, papers_directory: str, temperature: float = 0.1):
        self.settings = create_gemini_settings(papers_directory, temperature)
        self.papers_dir = papers_directory
        print(f"🤖 PaperQA Agent initialized with papers from: {papers_directory}")

    async def ask_question(self, question: str, use_agent: bool = True):
        """Ask a question about the research papers"""
        print(f"\n❓ Question: {question}")
        print("🔍 Searching through research papers...")
        try:
            if use_agent:
                response = await agent_query(query=question, settings=self.settings)
            else:
                response = ask(question, settings=self.settings)
            return response
        except Exception as e:
            print(f"❌ Error processing question: {e}")
            return None

    def display_answer(self, response):
        """Display the answer with formatting"""
        if response is None:
            print("❌ No response received")
            return
        print("\n" + "=" * 60)
        print("📋 ANSWER:")
        print("=" * 60)
        answer_text = getattr(response, 'answer', str(response))
        print(f"\n{answer_text}")
        contexts = getattr(response, 'contexts', getattr(response, 'context', []))
        if contexts:
            print("\n" + "-" * 40)
            print("📚 SOURCES USED:")
            print("-" * 40)
            for i, context in enumerate(contexts[:3], 1):
                context_name = getattr(context, 'name', getattr(context, 'doc', f'Source {i}'))
                context_text = getattr(context, 'text', getattr(context, 'content', str(context)))
                print(f"\n{i}. {context_name}")
                print(f"   Text preview: {context_text[:150]}...")

    async def multi_question_analysis(self, questions: list):
        """Analyze multiple questions in sequence"""
        results = {}
        for i, question in enumerate(questions, 1):
            print(f"\n🔄 Processing question {i}/{len(questions)}")
            response = await self.ask_question(question)
            results[question] = response
            if response:
                print(f"✅ Completed: {question[:50]}...")
            else:
                print(f"❌ Failed: {question[:50]}...")
        return results

    async def comparative_analysis(self, topic: str):
        """Perform comparative analysis across papers"""
        questions = [
            f"What are the key innovations in {topic}?",
            f"What are the limitations of current {topic} approaches?",
            f"What future research directions are suggested for {topic}?",
        ]
        print(f"\n🔬 Starting comparative analysis on: {topic}")
        return await self.multi_question_analysis(questions)
async def basic_demo():
    """Demonstrate basic PaperQA functionality"""
    agent = PaperQAAgent(papers_directory)
    question = "What is the transformer architecture and why is it important?"
    response = await agent.ask_question(question)
    agent.display_answer(response)

print("🚀 Running basic demonstration...")
await basic_demo()
async def advanced_demo():
    """Demonstrate advanced multi-question analysis"""
    agent = PaperQAAgent(papers_directory, temperature=0.2)
    questions = [
        "How do attention mechanisms work in transformers?",
        "What are the computational challenges of large language models?",
        "How has pre-training evolved in natural language processing?"
    ]
    print("🧠 Running advanced multi-question analysis...")
    results = await agent.multi_question_analysis(questions)
    for question, response in results.items():
        print(f"\n{'=' * 80}")
        print(f"Q: {question}")
        print('=' * 80)
        if response:
            answer_text = getattr(response, 'answer', str(response))
            display_text = answer_text[:300] + "..." if len(answer_text) > 300 else answer_text
            print(display_text)
        else:
            print("❌ No answer available")

print("\n🚀 Running advanced demonstration...")
await advanced_demo()
async def research_comparison_demo():
    """Demonstrate comparative research analysis"""
    agent = PaperQAAgent(papers_directory)
    results = await agent.comparative_analysis("attention mechanisms in neural networks")
    print("\n" + "=" * 80)
    print("📊 COMPARATIVE ANALYSIS RESULTS")
    print("=" * 80)
    for question, response in results.items():
        print(f"\n🔍 {question}")
        print("-" * 50)
        if response:
            answer_text = getattr(response, 'answer', str(response))
            print(answer_text)
        else:
            print("❌ Analysis unavailable")
        print()

print("🚀 Running comparative research analysis...")
await research_comparison_demo()
We define a PaperQAAgent that uses our Gemini-tuned PaperQA2 settings to search papers, answer questions, and cite sources with clean display helpers. We then run basic, advanced multi-question, and comparative demos so we can interrogate the literature end to end and summarize findings effectively.
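The display helpers in the agent rely on chains of `getattr` calls so the code keeps working even if the response object's attribute names differ between PaperQA2 versions. That defensive pattern can be factored into a small standalone helper; this is a sketch of the idea, and the attribute names are the same guesses the class itself makes:

```python
from types import SimpleNamespace

def first_attr(obj, names, default=None):
    """Return the first attribute in `names` that exists on obj
    and is not None, falling back to `default` otherwise."""
    for name in names:
        value = getattr(obj, name, None)
        if value is not None:
            return value
    return default

# Works regardless of which schema the response object follows.
old_style = SimpleNamespace(answer="Transformers use self-attention.")
new_style = SimpleNamespace(response="Attention weighs token pairs.")
print(first_attr(old_style, ["answer", "response"]))  # → Transformers use self-attention.
print(first_attr(new_style, ["answer", "response"]))  # → Attention weighs token pairs.
```

The trade-off of this style is that schema changes fail silently rather than loudly, which suits a demo notebook but would deserve explicit version checks in production code.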
def create_interactive_agent():
    """Create an interactive agent for custom queries"""
    agent = PaperQAAgent(papers_directory)

    async def query(question: str, show_sources: bool = True):
        """Interactive query function"""
        response = await agent.ask_question(question)
        if response:
            answer_text = getattr(response, 'answer', str(response))
            print(f"\n🤖 Answer:\n{answer_text}")
            if show_sources:
                contexts = getattr(response, 'contexts', getattr(response, 'context', []))
                if contexts:
                    print(f"\n📚 Based on {len(contexts)} sources:")
                    for i, ctx in enumerate(contexts[:3], 1):
                        ctx_name = getattr(ctx, 'name', getattr(ctx, 'doc', f'Source {i}'))
                        print(f"   {i}. {ctx_name}")
        else:
            print("❌ Sorry, I couldn't find an answer to that question.")
        return response

    return query

interactive_query = create_interactive_agent()
print("\n🎯 Interactive agent ready! You can now ask custom questions:")
print("Example: await interactive_query('How do transformers handle long sequences?')")
def print_usage_tips():
    """Print helpful usage tips"""
    tips = """
    🎯 USAGE TIPS FOR PAPERQA2 WITH GEMINI:

    1. 📝 Question Formulation:
       - Be specific about what you want to know
       - Ask about comparisons, mechanisms, or implications
       - Use domain-specific terminology

    2. 🔧 Model Configuration:
       - Gemini 1.5 Flash is free and reliable
       - Adjust temperature (0.0-1.0) for creativity vs precision
       - Use smaller chunk_size for faster processing

    3. 📚 Document Management:
       - Add PDFs to the papers directory
       - Use meaningful filenames
       - Mix different types of papers for better coverage

    4. ⚡ Performance Optimization:
       - Limit concurrent requests on the free tier
       - Use smaller evidence_k values for faster responses
       - Cache results by saving the agent state

    5. 🧠 Advanced Usage:
       - Chain multiple questions for deeper analysis
       - Use comparative analysis for research reviews
       - Combine with other tools for comprehensive workflows

    📖 Example Questions to Try:
    - "Compare the attention mechanisms in BERT vs GPT models"
    - "What are the computational bottlenecks in transformer training?"
    - "How has pre-training evolved from word2vec to modern LLMs?"
    - "What are the key innovations that made transformers successful?"
    """
    print(tips)

print_usage_tips()
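The performance tips above suggest limiting concurrent requests on the free tier. One generic way to enforce such a cap around any async query function is an `asyncio.Semaphore`; this is a hedged sketch with a stand-in `fake_query`, not a PaperQA2 API:

```python
import asyncio

async def gather_with_limit(coros, max_concurrent: int = 2):
    """Run coroutines concurrently, but never more than
    max_concurrent at a time (useful for free-tier rate limits)."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def limited(coro):
        async with semaphore:
            return await coro

    return await asyncio.gather(*(limited(c) for c in coros))

async def fake_query(q: str) -> str:
    await asyncio.sleep(0.01)  # stands in for a real agent call
    return f"answer to: {q}"

questions = ["q1", "q2", "q3", "q4"]
answers = asyncio.run(gather_with_limit([fake_query(q) for q in questions]))
print(answers)
```

In a notebook where an event loop is already running, `await gather_with_limit(...)` directly instead of using `asyncio.run`.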
def save_analysis_results(results: dict, filename: str = "paperqa_analysis.txt"):
    """Save analysis results to a file"""
    with open(filename, 'w', encoding='utf-8') as f:
        f.write("PaperQA2 Analysis Results\n")
        f.write("=" * 50 + "\n\n")
        for question, response in results.items():
            f.write(f"Question: {question}\n")
            f.write("-" * 30 + "\n")
            if response:
                answer_text = getattr(response, 'answer', str(response))
                f.write(f"Answer: {answer_text}\n")
                contexts = getattr(response, 'contexts', getattr(response, 'context', []))
                if contexts:
                    f.write(f"\nSources ({len(contexts)}):\n")
                    for i, ctx in enumerate(contexts, 1):
                        ctx_name = getattr(ctx, 'name', getattr(ctx, 'doc', f'Source {i}'))
                        f.write(f"   {i}. {ctx_name}\n")
            else:
                f.write("Answer: No response available\n")
            f.write("\n" + "=" * 50 + "\n\n")
    print(f"💾 Results saved to: {filename}")

print("✅ Tutorial complete! You now have a fully functional PaperQA2 AI Agent with Gemini.")
We create an interactive query helper that lets us ask custom questions on demand and optionally view the cited sources. We also print practical usage tips and add a saver that writes each Q&A with its source names to a results file, wrapping up the tutorial with a ready-to-use workflow.
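Because the saver writes plain text with stable `Question:`/`Answer:` prefixes, the results can also be reloaded later. A minimal parser for that layout might look like this (a sketch that assumes the format shown above and keeps only the first line of each answer):

```python
import tempfile

def load_analysis_results(filename: str) -> dict:
    """Parse a file in the save_analysis_results layout back into a
    {question: answer} dict. Multi-line answer bodies are truncated
    to their first line; sources are skipped for brevity."""
    results = {}
    question = None
    with open(filename, encoding="utf-8") as f:
        for raw in f:
            line = raw.rstrip("\n")
            if line.startswith("Question: "):
                question = line[len("Question: "):]
            elif line.startswith("Answer: ") and question is not None:
                results[question] = line[len("Answer: "):]
                question = None
    return results

# Round-trip a small sample written in the same layout.
sample = (
    "PaperQA2 Analysis Results\n" + "=" * 50 + "\n\n"
    + "Question: What is attention?\n" + "-" * 30 + "\n"
    + "Answer: A weighting mechanism over token pairs.\n\n"
    + "=" * 50 + "\n\n"
)
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write(sample)
print(load_analysis_results(tmp.name))
```

A structured format such as JSON would make this round trip lossless; the text file is chosen here for human readability.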
In conclusion, we successfully created a fully functional AI research assistant that combines the speed and flexibility of Gemini with the robust paper-processing capabilities of PaperQA2. We can now interactively explore scientific papers, run targeted queries, and even perform in-depth comparative analyses with minimal effort. This setup enhances our ability to digest complex research and streamlines the entire literature-review process, letting us focus on insights rather than manual searching.
Asif Razzaq is the CEO of Marktechpost Media Inc. As a visionary entrepreneur and engineer, Asif is committed to harnessing the potential of Artificial Intelligence for social good. His most recent endeavor is the launch of an Artificial Intelligence Media Platform, Marktechpost, which stands out for its in-depth coverage of machine learning and deep learning news that is both technically sound and easily understandable by a wide audience. The platform boasts over 2 million monthly views, illustrating its popularity among audiences.