In this series, I edit podcasts into 3 points.
This first installment is based on Practical AI podcast #138.
Episode hosts Dan and Chris have a conversation with William Falcon, creator of PyTorch Lightning and CEO of Grid AI.
The Three Points Series #1 – Why Using The World’s Top Research Lab Will Make Your Life Better
P1 Situation: Current private and academic research systems hinder AI and ML progress
a) Dissemination of research: It usually costs thousands of dollars for researchers to publish in Elsevier-managed journals… To read the entire paper, their colleagues need access to these journals. One large scientific publisher, Springer Nature, has online-only prices that averaged about $2,020 per year in 2021.
b) Registration of precedence: It makes it a lot more expensive and painful to look for prior art so that you don’t end up repeating someone else’s work by accident.
c) Providing a fixed archival version for future reference: When privately-owned journals go under, their archives can simply vanish from the internet.
d) Certification of quality: The replication crisis that first emerged in the early 2010s has since spread to numerous other fields. Multiple studies from the past decade show that well over half the surveyed papers in various disciplines failed to replicate.
– Balaji Srinivasan, https://1729.com/crypto-sci-hub
P2 PyTorch Lightning: Framework advances research and application
As existing private and academic research systems become diffuse, PyTorch Lightning concentrates resources and information in one place.
William Falcon created PyTorch Lightning along with an open source community of people grappling with the systemic and technical challenges that come with new research. He was inspired by the frustrations of AI research during his Ph.D. work at NYU and later at FAIR (Facebook Artificial Intelligence Research).
“My vision was:
Can we build the world’s research lab?
Can we all have access to top researchers and resources?”
PyTorch Lightning is doing just that. The open source project is approaching 500 contributors. Top researchers and Ph.D.s from all over the world are implementing AI projects and putting them into papers, which are then available within a few hours, ready and usable for everyone.
The idea of the world’s research lab is to be able to stand on the shoulders of giants before even starting your work. That way, your time goes to new innovations or to improving on past work, instead of to all the frustrations required to get to the same place other people have already reached.
A major goal of the project is to solve efficiency and communication problems through interoperability. For those who are not familiar with this term, it means that the combination and sequence originally used for the data, model and hardware is abstracted so that the seed or source is retained in a single callable location. An individual researcher can then call on this location to iterate from the exact parameters of their original idea, or it can be shared and iterated upon by collaborators within their team or on other teams. This gives you and the teams you work with the ability to iterate faster and more accurately, with state-of-the-art version control.
“PyTorch Lightning is a research project – how do you factor out deep learning code and make it interoperable… The outcome of doing anything with AI is a function of how fast you iterate through ideas.”
William Falcon
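The "single callable location" idea described above can be illustrated with a minimal sketch in plain Python. This is not PyTorch Lightning's API. The names ExperimentConfig and run_experiment, the field values, and the fake loss curve are all hypothetical and exist only to show how bundling the seed, data, model and hardware choices in one place lets a run be repeated exactly or varied one parameter at a time.

```python
import random
from dataclasses import dataclass, replace
from typing import List


@dataclass(frozen=True)
class ExperimentConfig:
    """Hypothetical record of everything needed to repeat a run."""
    seed: int
    dataset: str
    model: str
    accelerator: str
    learning_rate: float


def run_experiment(config: ExperimentConfig) -> List[float]:
    """Stand-in for training: a fake loss curve that depends only on config."""
    rng = random.Random(config.seed)  # local RNG, so runs do not affect each other
    loss = 1.0
    losses = []
    for _ in range(5):
        # Shrink the loss by a seeded random fraction scaled by the learning rate.
        loss *= 1.0 - config.learning_rate * rng.random()
        losses.append(round(loss, 6))
    return losses


# The original idea, captured once (all values are illustrative).
baseline = ExperimentConfig(
    seed=42,
    dataset="toy-dataset",
    model="toy-model",
    accelerator="cpu",
    learning_rate=0.1,
)

# A collaborator iterates from the exact same starting point,
# changing a single parameter and leaving the rest untouched.
variant = replace(baseline, learning_rate=0.2)

# Re-running the same config reproduces the same curve.
print(run_experiment(baseline) == run_experiment(baseline))
print(run_experiment(baseline))
print(run_experiment(variant))
```

Because the config is frozen and the random generator is seeded from it, sharing the config object is enough for a teammate to reproduce the baseline before trying their own variation.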
P3 Grid AI: Power through iterations faster to get applied outcomes
“How fast you can power through ideas is probably the single biggest predictor if that thing is going to work or not.”
William Falcon
Grid’s tooling is comparable to managed database tooling. You can try to design a database system in-house, but in-house hardware inevitably limits how far you can burst compute power, which makes outside solutions the more reliable option. History has shown database services overwhelmingly proving out these advantages; one example is Amazon Web Services (AWS), which furthered its sector growth in 2021 to make up 52% of Amazon’s Operating Revenue [1]. The same concept holds for AI and ML service solutions.
“Running on Grid means that we install your dependencies, everything you need to link up your data, in a matter of minutes, if not seconds. It’s just there, and it’s repeatable, and things start immediately, so it’s a lot cheaper. With Grid you can go spin up 200 GPUs, run for five minutes and shut them down, and you just got a lot done. Whereas on your own machines, even if you were to do it yourself on the cloud, you would probably not even get the models running for 20 minutes while you spin up the machines and set up all that stuff.”
William Falcon
Using Grid means that all the hard work has been done in advance through accumulated knowledge. To decide whether you have the capacity to do this in-house, assess for yourself: am I using my own hardware networking or AWS? If the former, go for it. If the latter, you will likely save a great deal of headache by leveraging the benefits of specialization.
1 https://www.techradar.com/news/aws-is-now-a-bigger-part-of-amazons-overall-success-than-ever