According to new research, Python has become the preferred programming language for developers creating generative AI applications.
“Python is the language of choice for AI programming,” says a report from cloud data company Snowflake, which analyzed usage data from 9,000 of its customers.
Python usage on its Snowpark platform grew 571% year over year, significantly more than any other language. Use of other languages, such as Scala (up 387%) and Java (up 131%), also grew, but not as quickly.
“Python skills will become increasingly essential for development teams as they venture into advanced AI,” the report says.
This is because Python has a lot to offer, according to the report. The programming language is easy to learn and read, “allowing developers to focus on solving AI problems rather than analyzing abstract syntax.”
It has a large ecosystem of libraries and frameworks to simplify otherwise daunting AI tasks, “from implementing neural networks to natural language processing.” And there is also a large active community of contributors who help with learning and problem solving.
“Overall, Python allows developers to focus on the problem, not the language. They can work fast, accelerating prototyping and experimentation and therefore overall learning as development teams make early forays into cutting-edge AI projects,” the report says.
Snowflake's research follows an analysis earlier this month that showed enthusiasm for Python continues to grow. The Tiobe programming language ranking noted that the gap between Python and the rest of the pack has never been larger.
Python supports unstructured data enhancements
The Snowflake report also found that companies are making more use of their unstructured data. Most data (perhaps up to 90%) is unstructured, in the form of videos, images and documents, and the company said it saw unstructured data processing grow by 123%.
“That's good news for many uses, including advanced AI,” the report says. “Proprietary data will give the advantage to large language models, so unlocking that underutilized 90% has great value.”
It should be noted that this type of data is processed with Python, Java and Scala.
“Given that Python in particular is the language of choice for many developers, data engineers, and data scientists, its rapid adoption suggests that these unstructured data workflows are not just for building data pipelines, but also involve data applications, artificial intelligence and machine learning models,” Snowflake said.
Snowflake records an “LLM explosion”
Certainly, building generative AI applications on top of large language models (LLMs) is now a priority for many developers.
“The LLM explosion is happening now, probably in your office,” Snowflake said.
The company said that in the last year, in its Streamlit developer community, it saw 20,076 developers working on 33,143 LLM-powered applications. Nearly two-thirds of developers said they were working on work projects.
While generative AI is not yet an all-encompassing technology, Snowflake said that “we are definitely seeing a lot of effort to get there as soon as possible,” underscoring the intense enterprise interest in using AI tools and applications.
The type of apps developers create is also evolving: Snowflake said that between May 2023 and January 2024 on Streamlit, chatbots went from 18% of LLM apps to 46%.
This most likely does not represent a change in the market appetite for LLM applications, but it shows how developers are increasing their skills and are able to create more complex chatbot applications.
Developers said their main concern when creating generative AI applications was whether the LLM's response was accurate (a reference to the current problem of AI hallucinations), followed by concerns about data privacy.
In addition to this, companies are also taking a more proactive approach to data governance. The number of tags applied to an object increased by 72%, while the number of objects with a tag directly assigned increased nearly 80%, and the number of masking or row access policies applied increased by 98%.
But Snowflake also said that the cumulative number of queries made against policy-protected objects increased by 142%. This is particularly significant, it said, because it shows that companies are increasing their use of data while ensuring responsible use.
“We are seeing more and more governance through the use of tags and masking policies, but the amount of work being done with this more carefully governed data is increasing rapidly.”
Jennifer Belissent, chief data strategist at Snowflake, said that while data security has long been a key focus, the rapid acceleration of artificial intelligence applications has brought the issue to the forefront. Addressing issues such as privacy and security “offers peace of mind.”
“When data is protected, it can be used safely,” she said.
“Taken individually, each of these trends is a single data point that shows how organizations around the world are addressing different challenges. When considered together, they tell a broader story about how CIOs, CTOs, and CDOs are modernizing their organizations, addressing AI experiments and data problem solving – all necessary steps to take advantage of the opportunities presented by advanced AI.”