Submit and evaluate models on the GAIA leaderboard
Generate text based on prompts
test
Generate code for applications
testing then testing locally