Title:
Code-Upload AI Challenges on EvalAI

dc.contributor.advisor Batra, Dhruv
dc.contributor.author Jain, Rishabh
dc.contributor.committeeMember Parikh, Devi
dc.contributor.committeeMember Lee, Stefan
dc.contributor.department Interactive Computing
dc.date.accessioned 2021-06-10T16:51:03Z
dc.date.available 2021-06-10T16:51:03Z
dc.date.created 2021-05
dc.date.issued 2021-05-04
dc.date.submitted May 2021
dc.date.updated 2021-06-10T16:51:04Z
dc.description.abstract Artificial intelligence research develops techniques and systems whose performance must be evaluated regularly to certify and foster progress in the discipline. We have built tools such as EvalAI that help evaluate the performance of these systems and push the frontiers of machine learning and artificial intelligence. Initially, the AI community relied on simple, traditional evaluation methods in the form of prediction-upload challenges, but with the advent of deep learning, larger datasets, and complex AI agents, these methods are no longer sufficient. An alternative is to have participants upload their agents' code, run it on a sequestered test dataset, and report the results on a leaderboard. In this work, we introduce code-upload evaluation of AI agents on EvalAI for all kinds of AI tasks, i.e., reinforcement learning, supervised learning, and unsupervised learning. We offer features such as a scalable backend, prioritized submission evaluation, a secure test environment, and execution of agent code in an isolated, sanitized environment. The end-to-end pipeline is flexible, modular, and portable, and can later be extended to multi-agent setups and evaluation on dynamic datasets. We also propose a GitHub-based procedure for AI challenge creation that versions and maintains challenge configurations and reduces friction in this otherwise cumbersome process. Finally, we focus on providing analytics to all users of the platform and on easing the hosting of EvalAI on private servers as an internal evaluation platform.
dc.description.degree M.S.
dc.format.mimetype application/pdf
dc.identifier.uri http://hdl.handle.net/1853/64704
dc.language.iso en_US
dc.publisher Georgia Institute of Technology
dc.subject AI Challenge Evaluation
dc.subject Machine Learning
dc.subject Artificial Intelligence
dc.subject Code-Upload AI Challenges
dc.subject EvalAI
dc.subject AI Agent Evaluation
dc.subject Reinforcement Learning Challenges
dc.subject Reinforcement Learning
dc.title Code-Upload AI Challenges on EvalAI
dc.type Text
dc.type.genre Thesis
dspace.entity.type Publication
local.contributor.corporatename College of Computing
local.contributor.corporatename School of Interactive Computing
local.relation.ispartofseries Master of Science in Computer Science
relation.isOrgUnitOfPublication c8892b3c-8db6-4b7b-a33a-1b67f7db2021
relation.isOrgUnitOfPublication aac3f010-e629-4d08-8276-81143eeaf5cc
relation.isSeriesOfPublication 3ef9b3be-896e-4b1b-8aa6-e24d540b7d43
thesis.degree.level Masters
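
To make the code-upload evaluation flow described in the abstract concrete, here is a minimal Python sketch of such an evaluation loop. All names here (Agent, evaluate, the Gym-style env.reset/env.step contract) are illustrative assumptions for exposition, not EvalAI's actual interfaces; the thesis itself should be consulted for the real pipeline.

```python
# Minimal sketch of a code-upload evaluation loop (illustrative only).
# `Agent`, `evaluate`, and the Gym-style environments are assumptions
# made for exposition; they are not EvalAI's actual interfaces.

from typing import Any, Dict, Iterable


class Agent:
    """Participant-implemented policy, packaged and uploaded as code."""

    def reset(self) -> None:
        """Called before each evaluation episode."""

    def act(self, observation: Any) -> Any:
        """Map the current observation to an action."""
        raise NotImplementedError


def evaluate(agent: Agent, envs: Iterable[Any]) -> Dict[str, float]:
    """Run an uploaded agent on sequestered test episodes.

    In the pipeline described in the abstract, a loop like this would
    run inside an isolated, sanitized container so the agent never sees
    the hidden test data directly, and the resulting metrics would be
    reported to the challenge leaderboard.
    """
    rewards = []
    for env in envs:  # each env: hypothetical Gym-style environment
        agent.reset()
        obs = env.reset()
        done, episode_reward = False, 0.0
        while not done:
            obs, reward, done, _info = env.step(agent.act(obs))
            episode_reward += reward
        rewards.append(episode_reward)
    return {"mean_reward": sum(rewards) / max(len(rewards), 1)}
```

Keeping the participant-facing surface this small (a reset/act contract) is what lets one evaluation harness cover reinforcement learning as well as supervised and unsupervised tasks, since only the episode loop and metrics change.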
Files
Original bundle
Name: JAIN-THESIS-2021.pdf
Size: 10.12 MB
Format: Adobe Portable Document Format
License bundle
Name: LICENSE.txt
Size: 3.86 KB
Format: Plain Text