__NOTOC__
'''Inga Luchs'''
= AI for All?<br>Challenging the Democratization of Machine Learning =


[[Category:Toward a Minor Tech]]
<span class="running-header">AI for All? Challenging the Democratization of Machine Learning</span>
[[Category:5000 words]]
 
== Abstract ==
Research in artificial intelligence (AI) is heavily shaped by big tech today. In the US context, companies such as Google and Microsoft hold a tremendous position of power due to their control over cloud computing, large data sets and AI talent. In light of this dominance, many media researchers and activists demand open infrastructures and community-led approaches to provide alternative perspectives – however, it is exactly this discourse that companies are appropriating for their expansion strategies. In recent years, big tech has taken up the narrative of democratizing AI by open-sourcing their machine learning (ML) tools, simplifying and automating the application of AI and offering free educational ML resources. The question that remains is how an alternative approach to ML infrastructures – and to the development of ML systems – can still be possible. What are the implications of big tech’s drive for infrastructural expansion under the umbrella of ‘democratization’? And what would a true democratization of ML entail? I will trace these two questions by critically examining, first, the open-source discourse advanced by big tech and, second, the discourse around the AI open-source community Hugging Face, which sees AI ethics and democratization at the heart of its endeavour. Lastly, I will show how ML algorithms need to be considered beyond their instrumental notion. It is thus not enough to simply hand over the technology to the community – we need to think about how we can conceptualize a radically different approach to the creation of ML systems.


= AI for All? Challenging the Democratization of Machine Learning =
<div class="page-break"></div>


== Introduction ==
Machine learning (ML) has grown to be a central area of artificial intelligence in recent decades. From search engine queries and the filtering of spam e-mails to the recommendation of books and movies, the detection of credit card fraud and predictive policing, applications based on ML algorithms are taking over the classification tasks of our everyday life. These algorithmic operations, however, cannot be separated from the cultural sphere in which they emerge. Consequently, they are not only mirroring biases already existing in society, but are further deepening them, consolidating race, class, and gender as immutable categories (Apprich, “Introduction”).


The research and development of AI and ML algorithms is heavily shaped by big technology companies. In the United States, for example, Google, Amazon, and Microsoft wield a great deal of power over the AI industry – because it is they who have the necessary cloud computing resources and data sets, but also the unique position to draw highly qualified AI talent (Dyer-Witheford, Kjøsen and Steinhoff 43). In addition, they are increasingly offering AI or ML ‘as a service’. This includes ready-to-use AI technologies that external companies can feed into their products, as well as open access to their infrastructures for the training and development of ML models (Srnicek, “The Political Economy of Artificial Intelligence”).


With respect to these issues of algorithmic discrimination (O’Neil; Eubanks), the dominance of big tech in the development of ML is crucial, because who develops AI systems significantly shapes how AI is imagined and built – and these spaces “tend to be extremely white, affluent, technically oriented, and male.” (West et al. 6) Countering this problem, many critical media researchers plead for a participatory approach that includes more diverse communities in the creation of AI systems (Costanza-Chock; Benjamin; D’Ignazio and Klein). In her book ''Race after Technology'', for instance, Ruha Benjamin underlines that the development of AI systems must be guided by values other than economic interests and demands “a socially conscious approach to tech development that would require prioritizing equity over efficiency, social good over market imperatives.” (Benjamin 183) Further, following Benjamin, this re-design “cannot be limited to industry, nonprofit, and government actors, but must include community-based organizations that offer a vital set of counternarratives.” (Benjamin 188) According to the authors of ''Data Feminism'', Catherine D’Ignazio and Lauren F. Klein, this includes a firm stance against the forms of technological solutionism often performed by big tech. In this sense, they do not treat problems of algorithmic discrimination as a ‘technical bias’ of the system, but rather call to “address the source of the bias: structural oppression.” (D’Ignazio and Klein 63) Consequently, this perspective “leads to fundamentally different decisions about what to work on, who to work with, and when to stand up and say that a problem cannot and should not be solved by data and technology.” (ibid.)


The “Design Justice Network”, a collective of designers, developers, researchers and activists, assembles these demands. Taking up Joichi Ito’s call for ‘participant design’, the network has formulated several principles that should guide technological development, focusing on the inclusion of communities currently marginalized by AI systems and favoring collaborative approaches by “shar[ing] design knowledge and tools” in order to “work towards sustainable, community-led and -controlled outcomes” (Costanza-Chock 11-12). At the same time, it is exactly this discourse that big tech companies have appropriated: they, too, aim to ‘democratize’ AI – which entails distributing both its benefits and its tools to everyone.


In this research essay, I will first outline the way big tech companies utilize the democratization discourse to their economic advantage, positioning their ML infrastructures in a way that serves their expansion. Secondly, against the background of many media researchers’ call for ‘community-led practices’ in AI, I will critically investigate the US-American AI company Hugging Face, which advertises a “community-centric approach”. Similar to the discourse around community-led AI, the company sees itself “on a journey to advance and democratize artificial intelligence through open source and open science”, while decidedly opposing big tech, which has not had “a track record of doing the right thing for the community” (Goldman). In this regard, I aim to analyse what their notion of ‘democratization’ entails, particularly against the background of Hugging Face recently announcing its cooperation with Amazon Web Services (AWS).


While access to AI infrastructures and community-led AI development are certainly important, I will lastly show how ML algorithms need to be considered beyond their instrumental notion. It is thus not enough to simply hand over the technology to the community – we need to think about how we can conceptualize a radically different approach to the creation of ML systems. This particularly entails questioning the deeply capitalist notions along which ML and its infrastructures are currently developed, and asking how we might break with these values, which have been nourished for decades and are deeply intertwined with ML research, development and education.


== Tools and benefits “for everyone”: Big tech’s AI democratization ==
In the last decade, US-based tech companies primarily known for their social media platforms, search engines or online marketplaces have increasingly centered their endeavors around artificial intelligence. In 2017, for instance, Google CEO Sundar Pichai announced at the yearly Google I/O conference that the company would be focusing on an “AI first approach” (Google Developers). From then on, Google has been explicitly working on the integration of ML technologies into its products, such as its search engine, its YouTube recommendation algorithm or its file hosting service Google Drive. Around the same time, the research department Google AI was established – and other big tech companies, too, set up or further invested in their own AI divisions (see, for instance, IBM, Microsoft, and Meta). Beyond integrating AI into their applications, these companies have moreover started to offer their AI technologies themselves as a product, moving into the heart of the AI industry (Srnicek, “Data, Compute, Labor” 242).


The corporate advances in the field of AI are accompanied by a discourse around ‘AI democratization’, which centers around the aim to make AI applications and infrastructures available to everyone. As Marcus Burkhardt details, this is targeted at users and developers:
 
<blockquote>“For developers this democratization entails the possibility to make use of AI in their own products and to partake in shaping the future of AI by having open or paid access to resources and services […]. Users on the other hand are enlisted in the democratization of AI as beneficiaries of technologies that are ‘infused’ with artificial intelligence and machine learning.” (211)</blockquote>
 
For the latter narrative, the companies closely link the advancement of AI with societal progress. Google AI’s mission, for instance, is to “create technologies that solve important problems and help people in their daily lives”, emphasizing the potential of AI to “empower people, widely benefit current and future generations, and work for the common good.” (Google AI, “Principles”) Microsoft underlines its aspiration to democratize AI “for every person and every organization”, grounded in the belief that the ‘essence’ of AI is “about helping everyone achieve more – humans and machines working together to make the world a better place.” (Microsoft News Center) And Meta states as its goal “to build AI responsibly, for everyone” and is “advancing AI for a more connected world.” (Meta AI)
 
The former narrative concerns the democratization of AI development, which corresponds to the open access provision of the infrastructures necessary to do machine learning (such as data sets, cloud storage and computing resources, but also frameworks and libraries). Google offers a whole section on its AI website titled “Tools for everyone.” Here, the company claims: “We’re making tools and resources available so that anyone can use technology to solve problems. Whether you’re just getting started or you’re already an expert, find the resources you need to reach your next breakthrough.” (Google AI, “Tools”) This includes access to its open-source machine learning platform TensorFlow, as well as to Google datasets, pre-trained models and other training resources. Microsoft states: “At Microsoft, we have an approach […] that seeks to democratize Artificial Intelligence (AI), to take it from the ivory towers and make it accessible for all”, which includes the availability of their “intelligent capabilities […] to every application developer in the world.” (Microsoft News Center) And Amazon Web Services deploys its cloud as a means to “accelerat[e] the pace of innovation, democratiz[e] access to data, and allow[] researchers and scientists to scale, work collaboratively, and make new discoveries from which we may all benefit.” (Kratz)
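To give a concrete impression of what this openly provided tooling amounts to in practice, the following minimal sketch (assuming a standard installation of the freely available TensorFlow library, and using one of the benchmark datasets Google distributes with it) trains a small classifier in a few lines of high-level code; the brevity is precisely what the “tools for everyone” rhetoric trades on.
<syntaxhighlight lang="python">
# Minimal sketch: what 'open access' to a corporate ML framework looks like in
# practice. Requires only the freely available TensorFlow library.
import tensorflow as tf

# Load a small benchmark dataset that ships with the framework.
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0  # scale pixel values to [0, 1]

# Define and train a tiny image classifier with the high-level Keras API.
model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(x_train, y_train, epochs=1)
print(model.evaluate(x_test, y_test))  # [loss, accuracy] on the test set
</syntaxhighlight>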


Furthermore, the companies aim to lower the barrier to AI development tools “so that even non-experts inside and outside companies and universities can increasingly use the corresponding technologies.” (Sudmann 23) This entails, for instance, a variety of educational resources on offer, in the form of free ML introductory courses and training certificates which address not only experienced developers but also those who are looking for an entry point into ML development (Luchs, Apprich and Broersma). We can also notice a growing platformization of AI development tools, which leads to automated and standardized forms of ML development and is meant to enable anyone to develop ML systems without prerequisite knowledge (see, for instance, Google’s Vertex AI).


For both users of AI applications and their developers, the democratization of AI revolves primarily around the notion of ''access'' – and increasing the availability of AI technologies does indeed facilitate their democratization in some regard: it enables wide accessibility of ML infrastructures, and it expands – to some extent – the circle of those who can use and develop AI technologies in the first place. Nevertheless, this discourse must be viewed critically and as part of a larger historical trajectory that goes back to the beginnings of network technologies, their promises, and their commodification. In this sense, the narrative that access to technologies empowers individuals, and that this, in turn, leads to more democratic societies, is by no means new. On the contrary, it has been an integral part of Silicon Valley’s “Californian Ideology” (Barbrook and Cameron) since its early beginnings – as Fred Turner, for instance, shows in his book ''From Counterculture to Cyberculture'' (2006), where he traces the origins of this digital utopianism.


An illustrative example is provided by the virtual communities that began to appear in the 1980s with the advent of personal computers and bulletin board systems. For the first time, users could communicate across local barriers and in real time, which facilitated the forming of connections in new ways (Apprich, ''Technotopia'' 90). As a result, these virtual communities were seen as a glimpse of the promise to “dissolve social hierarchies and enable a self-government of emancipated citizens” (ibid. 91). It is this faith in the liberating power of network technologies – further manifested by the Internet emerging in the 1990s – that still shapes Silicon Valley up until today. Mark Zuckerberg, founder of Facebook, for instance, “envisions a world in which individuals, communities, and nations create an ideal social order through the constant exchange of information – that is, through staying ‘connected’” (Turner, “Machine Politics”). What is even more important, however, is that the companies of Silicon Valley see themselves as responsible for providing the necessary infrastructures. It is thus the same narrative – the view that new technologies are facilitators of social progress – which seamlessly fits into their capitalist aims and which “proved enormously profitable across Silicon Valley. By justifying the belief that for-profit systems are the best way to improve public life, it has helped turn the expression of individual experience into raw material that can be mined, processed, and sold.” (ibid.)


We can tell a similar story when it comes to software development. As Nathaniel Tkacz shows, two movements emerged in its initial years, which displayed “two competing mutations of liberalism” (24): the Free Software Movement initiated by Richard Stallman in the 1980s, which declared that all software should be ‘free’ in terms of usage, distribution and modification – and thus non-proprietary; and the Open Source Initiative, founded by Bruce Perens and Eric Raymond in 1998 (ibid. 21-23), which stood for a liberalism geared toward economic growth and innovation. In his well-known text ''The Cathedral and the Bazaar'' (2001), Raymond elaborates that the development of software should not be centrally controlled (his notion of the ''cathedral''), but rather be as open as possible, allowing for a high degree of individual contribution (resembling a ''bazaar''). At the same time, companies should be able to make use of the increased productivity by commodifying the results. Raymond’s ''bazaar'' thus centers around a market for “competing ‘agendas and ideas’; progress ‘at a speed barely imaginable’; and the miraculous emergence of a ‘coherent and stable system’” (Tkacz 24, citing Raymond).


It is this economic line of thought that also dominates the AI industry today. Big tech companies have an evident economic interest in expanding the reach of their AI technologies and infrastructures. Hence, what is advertised as democratization must above all be viewed as an expansion strategy, in which users are positioned as customers of corporate products. By making their infrastructures openly accessible, companies draw more developers to them, which further establishes these infrastructures in AI development generally. Further, developers trained on these infrastructures become dependent on the companies’ products. And – as we can see – the open-source discourse serves as a means to drive ML research and to harness free contributions from the community, which in turn leads to further improvement of the corresponding AI technologies (Metz).


Advances under the frame of democratization can thus be understood as measures to ensure that company-owned products become “part of the general conditions of production”, serving as a “source of robust no-cost programming, a potential recruitment ground, and a strategic site for attracting users to their platforms.” (Dyer-Witheford, Kjøsen and Steinhoff 54) Or, as the authors state at another instance: “If AI becomes generally available, it will still remain under the control of these capitalist providers.” (ibid. 56)


Against the background of this corporate dominance, Pieter Verdegem underlines the importance of current AI ethics debates as outlined in the introduction, but pleads particularly for a “radical democratization of AI” which not only entails accessibility for everyone, but also takes the political and economic dimensions of the AI industry into account. Facing “a situation whereby only a few organisations, whether governmental or corporate, have the economic and political power to decide what type of AI will be developed and what purposes it will serve” (Verdegem, “Introduction” 12), Verdegem demands “a digital infrastructure that is available to and provides advantages for a broad range of stakeholders in society, not just the AI behemoths.” (Verdegem, “Dismantling AI Capitalism” 8)


In the following, I will thus shift the attention to Hugging Face, an AI company that places the ‘community’ at the center of its endeavors, and analyze it against the background of these demands.


== Community-centric AI: Hugging Face as alternative to big tech? ==
Hugging Face is a New York-based AI company founded in 2016 by Clement Delangue, Julien Chaumond and Thomas Wolf. Originally, Hugging Face started out as a chatbot app for teenagers (Dillet). After positive responses to open-sourcing the models the chatbot was built on, the company moved to become a platform provider for open-source ML technologies (Osman and Sewell). Hugging Face is funded by 26 different investors and had raised $160.2 million in funding as of May 9, 2022 (Crunchbase). More than 5,000 organizations are using its models, including companies such as Meta AI, Google AI, Intel and Microsoft (Hugging Face, ''Official Website''). The company has also been listed in Forbes’ “AI 50 list” in 2022, which “recognizes standouts in privately-held North American companies making the most interesting and effective use of artificial intelligence technology.” (Popkin, Ohnsman and Cai)
 
On its website, Hugging Face presents itself as “the AI community building the future.” (Hugging Face, ''Official Website'') In an interview, founder Delangue elaborates:
 
<blockquote>“Just as science has always operated by making the field open and collaborative, we believe there’s a big risk of keeping machine learning power very concentrated in the hands of a few players, especially when these players haven’t had a track record of doing the right thing for the community. By building more openly and collaboratively within the ecosystem, we can make machine learning a positive technology for everyone and work on some short-term challenges that we are seeing.” (Goldman)</blockquote>


As we can see, Hugging Face follows narratives very similar to those advanced by big tech companies: first, the belief in social progress advanced by AI, from which everyone should benefit, and second, the need for collaboration when it comes to the development of AI systems. However, the company explicitly aims to counter the present concentration of power in the AI industry. At the center of its approach thus stands the desire to open-source models previously guarded by bigger players – particularly large language models, which are computationally intensive and not easily reproducible – in order to let everyone take part in the development of AI. As they state: “No single company, including the Tech Titans, will be able to ‘solve AI’ by themselves – the only way we’ll achieve this is by sharing knowledge and resources in a community-centric approach.” (Hugging Face, “Hugging Face Hub Documentation”)


In order to do so, Hugging Face offers an open-source library with “more than 100,000 machine learning models […], enabling others in turn to use those pretrained models for their own AI projects instead of having to build models from scratch.” (Popkin, Ohnsman and Cai) Moreover, Hugging Face is not only a model library but – taking the developer platform GitHub as a role model – acts as a platform: on the ‘Hugging Face Hub’, developers can store code and training data sets, but also “easily collaborate and build ML together” (Hugging Face, “Hugging Face Hub Documentation”).
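To illustrate what this reuse of pretrained models looks like in practice, here is a minimal sketch (assuming the open-source ''transformers'' library and a backend such as PyTorch are installed): a ready-made sentiment classifier is pulled from the Hugging Face Hub in a few lines of code, instead of being trained from scratch.
<syntaxhighlight lang="python">
# Minimal sketch: reusing a pretrained model from the Hugging Face Hub.
# Requires the open-source 'transformers' library (and e.g. PyTorch).
from transformers import pipeline

# Downloads a pretrained sentiment model from the Hub on first use.
classifier = pipeline(
    "sentiment-analysis",
    model="distilbert-base-uncased-finetuned-sst-2-english",
)
print(classifier("Open infrastructures do not guarantee open participation."))
# -> e.g. [{'label': 'NEGATIVE', 'score': ...}]
</syntaxhighlight>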


Given its explicit focus on community-centered approaches and its explicit stance against AI monopolization, the company seems to meet the demands outlined by media researchers above. However, with the company recently announcing its cooperation with Amazon Web Services (AWS), it seems that it, too, is deeply integrated into the economically driven ML ecosystem. Against the background of significant progress in the area of generative AI models (such as in text, audio or visual creation), which are generally proprietary and thus not publicly accessible, Hugging Face and AWS have declared a “long-term strategic partnership”, which is to “accelerate the availability of next-generation machine learning models by making them more accessible to the machine learning community and helping developers achieve the highest performance at the lowest cost.” (Boudier, Schmid and Simon) Specifically, this means that Hugging Face commits to AWS as its main cloud provider, so that users of Hugging Face can move easily between its platform and Amazon’s ML platform SageMaker, which is hosted on AWS and offers advanced cloud computing power (Bathgate). Vice versa, customers of AWS will be provided with Hugging Face models on Amazon’s platform.


Consequently, Hugging Face, too, while taking up the banner of democratization, principally acts within an economic context. A look at its business model provides further insight in this direction: while Hugging Face does offer its core technologies open-source and cost-free, several additional features come at a price and are organized around subscriptions and consumption-based plans (Osman and Sewell). Hugging Face’s paying customers comprise mostly big corporations “seeking expert support, additional security, autotrain features, private cloud, SaaS, and on-premise model hosting” (Osman and Sewell).


In this sense, it seems to become increasingly difficult not only to create alternative discourses around AI technologies, but also to provide sustainable alternatives that operate outside of big tech’s domain, given the challenge of reproducing the necessary infrastructures.


== So what might AI democratization look like? Taking up a minor perspective ==
AI technologies and their platforms are not an isolated phenomenon; rather, they can be regarded as another point in the genealogy of the commercialization of digital technologies by big tech companies – and in this light, their democratization, too, needs to be viewed critically.


As elaborated earlier, already with the emergence of net cultures in the 1990s there was a profound belief, forwarded by Silicon Valley, in the ability of technology to enhance collectivity and collaboration (Apprich, ''Technotopia'' 45). At the same time, however, an emerging net critique in Europe also believed in the potential of the new media technologies, but explicitly opposed the US-based Californian Ideology (ibid. 35). For its advocates, participation meant not solely the contribution of content to the emerging social networks, but being part of the growing project as a whole, “determining the directions, rules and enabling infrastructures of one’s own actions in a collective, participatory process.” (Stalder, “Partizipation” 221, own translation) It was then in the subsequent phase of commercialisation and the emergence of Web 2.0 that those “core concepts of the first internet generation – communication, participation, openness to new things […] – [were made] suitable for the masses”, turning ‘participation’ into “user-generated content” (ibid. 223, own translation).


Consequently, while digital media technologies were becoming generally available, “the infrastructures behind these tools [got] increasingly concentrated in the hands of a few, private corporations.” (Apprich, ''Technotopia'' 146) And even though their platformization oftentimes simplified their use (van Dijck, ''Cultures of Connectivity'' 6), it was the participation in their design that was closed off in favor of the streamlining and commercialization of user behavior.


With regard to the Californian Ideology, Clemens Apprich considers the instrumentalization of technology as the core problematic, one that hinders escaping a capitalist logic:


<blockquote>“The problem with this is that technology is not being recognised in its own logic, but rather seen as a means for something else – typically the liberation of the individual from the constraints of society. So, instead of acknowledging the socio-technical potential within it, technology is submitted to a communitarian thinking, which is predominantly defined by capitalist economy.” (''Technotopia'' 144)</blockquote>


As we have seen, these dynamics are very similar to the discourse around AI democratization: both on the side of demands for a community-led AI and on the side of big tech, we can recognize not only a wish to make AI accessible for all, but also the belief that “bringing the benefits of AI to everyone” (Google AI, ''Official Website'') will lead to social progress. Particularly in the big tech discourse, this serves economic rationales. While generally a domain reserved for technical experts, machine learning is, under the frame of AI democratization, commodified into a form that is easily executable. However, it is not true participation – or democratization – that is enacted here. Rather, the notion of democratization serves as a front for the establishment of corporate products for AI development, as well as for free labor via the tasks developers perform on openly accessible corporate frameworks. Moreover, similar to how platform companies today dominate how we search, consume content online or interact with friends and family, AI technologies are gradually being platformized, with big tech companies such as Google, Microsoft and Amazon competing to become the monopoly provider. At the same time, demanding the integration of community-based organizations and counternarratives to these economic rationales proves increasingly difficult, given the dominance big tech has already manifested in the AI industry – materially and discursively.


What we consequently need to do is go beyond the notion of ‘access’ as the sole condition for participation. At this point, we might again take as a model those 1990s net cultures that Apprich compellingly describes in his search for alternative imaginaries:


<blockquote>“In Europe, but also in the United States and elsewhere, non-commercial Internet Providers (e.g. Backspace, Centre for Culture & Communication, De Digitale Stad, Internationale Stadt, Ljudmila, Silver Server, Public Netbase, The Thing, XS4ALL) did not only offer Internet access, but also a platform for the self-determined use of new media technologies. The idea was to position net critique at the centre of action and to open up spaces of creation and experimentation […].” (''Technotopia'' 37)</blockquote>


With regard to the application and development of AI, democratization should thus mean not only the general availability of technology in the form of its material resources, but also a deeper understanding of and engagement with AI, which means challenging the existing power structures within the industry as well as confronting the inner logics of the technologies. Concepts such as scalability are deeply integrated into the practice of machine learning itself, which requires large amounts of data and high computational power; values such as universal applicability, efficiency, and simplicity dominate its everyday use (Luchs, Apprich, and Broersma); and AI infrastructures are constructed as “uniform blocks ready for further expansion” (Tsing 505), as we can see from the companies’ attempts to attract users to their platforms and to expand the reach of their products (which are already extremely difficult to escape). We need to reflect on how we can conceptualize a radically different approach to the creation of ML systems, one which breaks with these capitalist values that have been nourished for decades and are deeply intertwined with ML research, development and education – but also on how we can enable a relationship with AI technologies that does not consist in the mere execution of corporate products, but rather in true participation in their design.


One of the key beliefs of the proponents of big tech, as Joichi Ito states, is “that the world is ‘knowable’ and computationally simulatable, and that computers will be able to process the messiness of the real world just like they have every other problem that everyone said couldn’t be solved by computers.” (4) Instead, he posits, “[w]e need to embrace the unknowability – the irreducibility – of the real world […].” (ibid. 6) One way to conceive of an alternative perspective might thus be to follow a ‘nonscalability theory’ as an “alternative for conceptualizing the world” which “pays attention to the mounting pile of ruins that scalability leaves behind” (Tsing 507). For machine learning, this could mean acknowledging its limitations – concerning the messiness of reality and the impossibility of lossless translation, but also the messiness of the ML process itself, dealing with dirty data and the political notion of discrimination (Apprich, “Introduction”; Steyerl, “A Sea of Data”).


But also in practically engaging with the technology – in learning to do machine learning and in interacting with its platforms, libraries and datasets – we need to strive for critical practices. We should oppose big tech’s tendency to hide away ML operations behind obfuscating interfaces that we are the users of and look behind them in order to gain a deeper understanding of the technical operations and to acknowledge their embeddedness in our world. In fully understanding this condition, we sooner or later need to ask: is machine learning the best possible way to do data filtering and classification – or might we rather seek for other technological means?


== Acknowledgements ==
== Acknowledgements ==
I would like to thank the everyone who participated in the APRJA workshop “Towards a Minor Tech” as well as the members of the Groningen “Data Infrastructures and Algorithmic Practices” research group for their extensive feedback on previous drafts as well as for the inspiring conversations.
I would like to thank the participants of the APRJA workshop “Towards a Minor Tech”, the members of the Groningen “Data Infrastructures and Algorithmic Practices” research group and the anonymous peer reviewer for their extensive feedback on previous drafts as well as for the inspiring conversations.


== Works Cited ==
== Works cited ==
<div class="workscited">
Apprich, Clemens. ''Technotopia. A Media Genealogy of Net Cultures.'' Rowman & Littlefield, 2017.
Apprich, Clemens. ''Technotopia. A Media Genealogy of Net Cultures.'' Rowman & Littlefield, 2017.


Apprich, Clemens. “Introduction.” ''Pattern Discrimination'', edited by Clemens Apprich, Wendy Hui Kyong Chun, Florian Cramer and Hito Steyerl. meson press/Minnesota Press, 2018, pp. ix-xii.
Apprich, Clemens. “Introduction.” ''Pattern Discrimination'', edited by Clemens Apprich, Wendy Hui Kyong Chun, Florian Cramer and Hito Steyerl. meson press/Minnesota Press, 2018, pp. ix-xii.


Bathgate, Rory. “AWS and Hugging Face partner to ‘democratise’ ML, AI models.” ''ITPro'', 22 February 2023. <nowiki>https://www.itpro.co.uk/cloud/370113/aws-and-hugging-face-partner-to-democratise-ml-ai-models</nowiki>.  
Barbrook, Richard and Andy Cameron. “The Californian Ideology.” ''Mute'' vol. 1, no. 3, September 1995. https://www.metamute.org/editorial/articles/californian-ideology.


Benjamin, Ruha ''Race after Technology. Abolitionist Tools for the New Jim Code.'' Polity Press, 2019.
Bathgate, Rory. “AWS and Hugging Face Partner to ‘Democratise’ ML, AI Models.” ''ITPro'', 22 February 2023. https://www.itpro.co.uk/cloud/370113/aws-and-hugging-face-partner-to-democratise-ml-ai-models.  


Boudier, Jeff, Schmid, Philipp and Simon, Julien. “Hugging Face and AWS partner to make AI more accessible.''Hugging Face Blog'', 21 February 2023. <nowiki>https://huggingface.co/blog/aws-partnership</nowiki>.
Benjamin, Ruha. ''Race after Technology. Abolitionist Tools for the New Jim Code.'' Polity Press, 2019.


Burkhardt, Marcus. “Mapping the Democratization of AI on GitHub. A First Approach.” ''The Democratization of Artificial Intelligence. Net Politics in the Era of Learning Algorithms'', edited by Andreas Sudman. transcript, 2019, pp. 209-221''.''
Boudier, Jeff, Schmid, Philipp and Julien Simon. “Hugging Face and AWS Partner to Make AI More Accessible.” ''Hugging Face Blog'', 21 February 2023. https://huggingface.co/blog/aws-partnership.


Costanza-Chock, Sasha. “Design justice: Towards an intersectional feminist framework for design theory and practice.” ''Proceedings of the Design Research Society'', 2018.
Burkhardt, Marcus. “Mapping the Democratization of AI on GitHub. A First Approach.” ''The Democratization of Artificial Intelligence. Net Politics in the Era of Learning Algorithms'', edited by Andreas Sudmann. transcript, 2019, pp. 209-221.  


Crunchbase. “Hugging Face.” 2023. <nowiki>https://www.crunchbase.com/organization/hugging-face</nowiki>.
Costanza-Chock, Sasha. “Design Justice: Towards an Intersectional Feminist Framework for Design Theory and Practice.” ''Proceedings of the Design Research Society'', 2018.


D’Ignazio, Catherine and Klein, Lauren F. ''Data Feminism''. MIT Press, 2020.
Crunchbase. “Hugging Face.” 2023. https://www.crunchbase.com/organization/hugging-face.


Dillet, Romain. “Hugging Face wants to become your artificial BFF.''TechCrunch'', 9 March 2017. <nowiki>https://techcrunch.com/2017/03/09/hugging-face-wants-to-become-your-artificial-bff/</nowiki>.
D’Ignazio, Catherine and Lauren F. Klein. ''Data Feminism''. The MIT Press, 2020.


Dyer-Witheford, Nick, Kjøsen, Atle Mikkola and Steinhoff, James. ''Inhuman Power. Artificial Intelligence and the Future of Capitalism''. Pluto Press, 2019.
Dillet, Romain. “Hugging Face Wants to Become Your Artificial BFF.” ''TechCrunch'', 9 March 2017. https://techcrunch.com/2017/03/09/hugging-face-wants-to-become-your-artificial-bff/.
 
Dyer-Witheford, Nick, Kjøsen, Atle Mikkola and James Steinhoff. ''Inhuman Power. Artificial Intelligence and the Future of Capitalism''. Pluto Press, 2019.


Eubanks, Virginia. ''Automating Inequality''. St. Martin’s Press, 2017.
Eubanks, Virginia. ''Automating Inequality''. St. Martin’s Press, 2017.


Goldman, Sharon. “Fresh off $2B valuation, ML platform Hugging Face touts ‘open and collaborative approach’.''Venture Beat'', 9 May 2022. <nowiki>https://venturebeat.com/ai/fresh-off-2b-valuation-machine-learning-platform-hugging-face-highlights-open-and-collaborative-approach/</nowiki>.
Goldman, Sharon. “Fresh Off $2B Valuation, ML Platform Hugging Face Touts ‘Open and Collaborative Approach.’” ''Venture Beat'', 9 May 2022. https://venturebeat.com/ai/fresh-off-2b-valuation-machine-learning-platform-hugging-face-highlights-open-and-collaborative-approach/.


Google AI. ''Official Website''. 2023. <nowiki>https://ai.google/</nowiki>.
Google AI. ''Official Website''. 2023. https://ai.google/.


Google AI. “Principles.” ''Google AI Official Website'', 2023. <nowiki>https://ai.google/principles</nowiki>.
Google AI. “Principles.” ''Google AI Official Website'', 2023. https://ai.google/principles<.


Google AI. “Tools.” ''Official Website'', 2023. <nowiki>https://ai.google/tools</nowiki>.
Google AI. “Tools.” ''Official Website'', 2023. https://ai.google/tools.


Google Developers. “Google I/O Keynote (Google I/O '17).” Uploaded on 17 May 2017. <nowiki>https://www.youtube.com/watch?v=Y2VF8tmLFHw</nowiki>.
Google Developers. “Google I/O Keynote (Google I/O '17).” Uploaded on 17 May 2017. https://www.youtube.com/watch?v=Y2VF8tmLFHw.


Halpern, Orit, Jagoda, Patrick, West Kirkwood, Jeffrey and Weatherby, Leif. “Surplus Data: An Introduction.” ''Critical Inquiry'', vol. 48, no. 2, 2022.  
Hugging Face. ''Official Website'', 2023. https://huggingface.co.


Heuer, Hendrik, Jarke, Juliane and Breiter, Andreas. “Machine Learning in Tutorials – Universal Applicability, Underinformed Application, and Other Misconceptions”. ''Big Data & Society'', vol. 8, no. 1, 2021.  
Hugging Face. “Hugging Face Hub Documentation.''Official Website'', 2023. https://huggingface.co/docs/hub/index.


Hugging Face. ''Official Website'', 2023. <nowiki>https://huggingface.co</nowiki>.
Ito, Joichi. “Resisting Reduction: A Manifesto.''Journal of Design and Science'', 2018


Hugging Face. “Hugging Face Hub Documentation.” ''Official Website'', 2023. <nowiki>https://huggingface.co/docs/hub/index</nowiki>.
Kratz, Jeff. “Accelerating and democratizing research with the AWS Cloud.” ''AWS Public Sector Blog'', 14 September 2022. https://aws.amazon.com/blogs/publicsector/accelerating-democratizing-research-aws-cloud/.


Ito, Joichi. “Resisting Reduction: A Manifesto.” ''Journal of Design and Science'', 2018 10.21428/8f7503e4.
Luchs, Inga, Apprich, Clemens and Marcel Broersma. “Learning Machine Learning. On the Political Economy of AI Online Courses.” ''Big Data & Society'', vol. 10, no. 1, 2023.  


Kratz, Jeff. “Accelerating and democratizing research with the AWS Cloud.” ''AWS Public Sector Blog'', 14 September 2022. <nowiki>https://aws.amazon.com/blogs/publicsector/accelerating-democratizing-research-aws-cloud/</nowiki>.
Meta AI. “About.” ''Official Website,'' 2023. https://ai.facebook.com/about/.


Luchs, Inga, Apprich, Clemens and Broersma, Marcel. “Learning Machine Learning. On the Political Economy of AI Online Courses.” ''Big Data & Society'', vol. 10, no. 1, 2023.  
Metz, Cade. “Google Just Open Sourced TensorFlow, Its Artificial Intelligence Engine.” ''WIRED'', 9 November 2015. https://www.wired.com/2015/11/google-open-sources-its-artificial-intelligence-engine/.


Metz, Cade. “Google Just Open Sourced TensorFlow, Its Artificial Intelligence Engine.” ''WIRED'', 9 November 2015. <nowiki>https://www.wired.com/2015/11/google-open-sources-its-artificial-intelligence-engine/</nowiki>.
Microsoft News Center. “Democratizing AI. For Every Person and Every Organization.” ''Microsoft Features'', 26 September 2016. https://news.microsoft.com/features/democratizing-ai/.


Microsoft News Center. “Democratizing AI. For every person and every organization.''Microsoft Features'', 26 September 2016. <nowiki>https://news.microsoft.com/features/democratizing-ai/</nowiki>
O’Neil, Cathy. ''Weapons of Math Destruction''. Penguin Random House, 2016.


O'Neil, Cathy. ''Weapons of Math Destruction''. Penguin Random House, 2016.
Osman, Luqman and Dawson Sewell. “Hugging Face.” ''Contrary Research'', 14 September 2022. https://research.contrary.com/reports/hugging-face.


Popkin, Helen A.S., Ohsman, Alan and Cai, Kenrik. “The AI 50.” ''Forbes'', 9 May 2022. <nowiki>https://www.forbes.com/lists/ai50/?sh=73ac1959290f</nowiki>.
Popkin, Helen A.S., Ohsman, Alan and Kenrik Cai. “The AI 50.” ''Forbes'', 9 May 2022. https://www.forbes.com/lists/ai50/?sh=73ac1959290f.


Osman, Luqman and Sewell, Dawson. “Hugging Face.''Contrary Research'', 14 September 2022. <nowiki>https://research.contrary.com/reports/hugging-face</nowiki>.
Raymond, Eric S. ''The Cathedral and the Bazaar''. ''Musings on Linux and Open Source by an Accidental Revolutionary.'' O’Reilly, 2001.


Rieder, Bernhard. ''Engines of Order. A Mechanology of Algorithmic Techniques.'' Amsterdam University Press, 2020.
Srnicek, Nick. “The Political Economy of Artificial Intelligence.” Recorded talk, ''Great Transformation'', Jena, 23.09.-27.09.2019. https://www.youtube.com/watch?v=Fmi3fq3Q3Bo.


Srnicek, Nick. “The Political Economy of Artificial Intelligence.” Recorded talk, ''Great Transformation'', Jena, 23.09.-27.09.2019. <nowiki>https://www.youtube.com/</nowiki>
Srnicek, Nick. “Data, Compute, Labor.” ''Digital Work in the Planetary Market,'' edited by Mark Graham and Fabian Ferrari. The MIT Press, 2022, pp. 241-262.
 
watch?v=Fmi3fq3Q3Bo.


Stalder, Felix. “Partizipation. Von der Teilhabe zum Spektakel du zurück.” ''Vergessene Zukunft. Radikale Netzkulturen in Europa'', edited by Clemens Apprich and Felix Stalder. transcript, 2012, pp. 219-225.
Stalder, Felix. “Partizipation. Von der Teilhabe zum Spektakel du zurück.” ''Vergessene Zukunft. Radikale Netzkulturen in Europa'', edited by Clemens Apprich and Felix Stalder. transcript, 2012, pp. 219-225.
Line 147: Line 163:
Steyerl, Hito. “A Sea of Data: Pattern Recognition and Corporate Animism (Forked Version)”. ''Pattern Discrimination'', edited by Clemens Apprich, Wendy Hui Kyong Chun, Florian Cramer and Hito Steyerl. meson press/Minnesota Press, 2018, pp. 1-22  
Steyerl, Hito. “A Sea of Data: Pattern Recognition and Corporate Animism (Forked Version)”. ''Pattern Discrimination'', edited by Clemens Apprich, Wendy Hui Kyong Chun, Florian Cramer and Hito Steyerl. meson press/Minnesota Press, 2018, pp. 1-22  


Sudmann, Andreas. ''The Democratization of Artificial Intelligence. Net Politics in the Era of Learning Algorithms''. Transcript, 2019.
Sudmann, Andreas. ''The Democratization of Artificial Intelligence. Net Politics in the Era of Learning Algorithms''. transcript, 2019.


Tsing, Anna. “On Nonscalability: The Living World is not Amenable to Precision-Nested Scales.” ''Common Knowledge'', vol. 18, no. 3, 2012, pp. 505-524.
Tkacz, Nathaniel. ''Wikipedia and the Politics of Openness''. The University of Chicago Press, 2015.


van Dijck, José. ''The Culture of Connectivity. A Critical history of Social Media''. Oxford University Press, 2013.  
Tsing, Anna. “On Nonscalability: The Living World is Not Amenable to Precision-Nested Scales.” ''Common Knowledge'', vol. 18, no. 3, 2012, pp. 505-524.
 
Turner, Fred. ''From Counterculture to Cyberculture''. The University of Chicago Press, 2006.
 
Turner, Fred. “Machine Politics. The Rise of the Internet and a New Age of Authoritarianism.” ''Harper’s Magazine'', 2019. https://harpers.org/archive/2019/01/machine-politics-facebook-political-polarization/.
 
van Dijck, José. ''The Culture of Connectivity. A Critical History of Social Media''. Oxford University Press, 2013.  


Verdegem, Pieter. “Introduction: Why We Need Critical Perspectives on AI.” ''AI for Everyone? Critical Perspectives'', edited by Pieter Verdegem. University of Westminster Press, 2021, pp. 1-18.
Verdegem, Pieter. “Introduction: Why We Need Critical Perspectives on AI.” ''AI for Everyone? Critical Perspectives'', edited by Pieter Verdegem. University of Westminster Press, 2021, pp. 1-18.


Verdegem, Pieter. “Dismantling AI capitalism: The commons as an alternative to the power concentration of Big Tech.” ''AI & Society'', 9 April 2022.
Verdegem, Pieter. “Dismantling AI Capitalism: The Commons As an Alternative to the Power Concentration of Big Tech.” ''AI & Society'', 9 April 2022.
 
West, Sarah Myers, Whittaker, Meredith and Kate Crawford. ''Discriminating Systems. Gender, Race, and Power in AI''. AI Now Institute, 2019.
</div>


West, Sarah  Myers, Whittaker, Meredith and Crawford, Kate. ''Discriminating Systems. Gender, Race, and Power in AI''. AI Now Institute, 2019.
[[Category:Toward a Minor Tech]]
[[Category:5000 words]]

Latest revision as of 20:30, 21 July 2023

Inga Luchs

AI for All?
Challenging the Democratization of Machine Learning


Abstract

Research in artificial intelligence (AI) is heavily shaped by big tech today. In the US context, companies such as Google and Microsoft profit from a tremendous position of power due to their control over cloud computing, large data sets and AI talent. In light of this dominance, many media researchers and activists demand open infrastructures and community-led approaches to provide alternative perspectives – however, it is exactly this discourse that companies are appropriating for their expansion strategies. In recent years, big tech has taken up the narrative of democratizing AI by open-sourcing their machine learning (ML) tools, simplifying and automating the application of AI and offering free educational ML resources. The question that remains is how an alternative approach to ML infrastructures – and to the development of ML systems – can still be possible. What are the implications of big tech’s strive for infrastructural expansion under the umbrella of ‘democratization’? And what would a true democratization of ML entail? I will trace these two questions by critically examining, first, the open-source discourse advanced by big tech, as well as, second, the discourse around the AI open-source community Hugging Face that sees AI ethics and democratization at the heart of their endeavour. Lastly, I will show how ML algorithms need to be considered beyond their instrumental notion. It is thus not enough to simply hand over the technology to the community – we need to think about how we can conceptualize a radically different approach to the creation of ML systems.

Introduction

Machine learning (ML) has grown to be a central area of artificial intelligence in the last decades. Ranging from search engine queries, over the filtering of spam e-mails and the recommendation of books and movies to the detection of credit card fraud and predictive policing, applications that are based on ML algorithms are taking over the classification tasks of our everyday life. These algorithmic operations, however, cannot be separated from the cultural sphere in which they emerge. Consequently, they are not only mirroring biases already existing in society, but are further deepening them, consolidating race, class, and gender as immutable categories (Apprich, “Introduction”).

The research and development of AI and ML algorithms is heavily shaped by big technology companies. In the United States, for example, Google, Amazon, and Microsoft wield a great deal of power over the AI industry – because it is they who own the necessary cloud computing resources and data sets, and who are uniquely positioned to attract highly qualified AI talent (Dyer-Witheford, Kjøsen and Steinhoff 43). In addition, they increasingly offer AI or ML ‘as a service’. This includes ready-to-use AI technologies that external companies can feed into their products, as well as open-source access to their infrastructures for the training and development of ML models (Srnicek, “The Political Economy of Artificial Intelligence”).

With respect to the aforementioned issues of algorithmic discrimination (O’Neil; Eubanks), the dominance of big tech in the development of ML is crucial, because who develops AI systems significantly shapes how AI is imagined and built – and these spaces “tend to be extremely white, affluent, technically oriented, and male.” (West et al. 6) Countering this problem, many critical media researchers plead for a participatory approach that includes more diverse communities in the creation of AI systems (Costanza-Chock; Benjamin; D’Ignazio and Klein). In her book Race after Technology, for instance, Ruha Benjamin underlines that the development of AI systems must be guided by values other than economic interests and demands “a socially conscious approach to tech development that would require prioritizing equity over efficiency, social good over market imperatives.” (Benjamin 183) Further, following Benjamin, this re-design “cannot be limited to industry, nonprofit, and government actors, but must include community-based organizations that offer a vital set of counternarratives.” (Benjamin 188) According to the authors of the book Data Feminism, Catherine D’Ignazio and Lauren F. Klein, this includes a firm stance against the forms of technological solutionism often practiced by big tech. In this sense, they call not to tackle problems of algorithmic discrimination as ‘technical bias’ of the system, but rather to “address the source of the bias: structural oppression.” (D’Ignazio and Klein 63) Consequently, this perspective “leads to fundamentally different decisions about what to work on, who to work with, and when to stand up and say that a problem cannot and should not be solved by data and technology.” (ibid.)

The “Design Justice Network”, a collective of designers, developers, researchers and activists, brings these demands together. Taking up Joichi Ito’s call for ‘participant design’, the network has formulated several principles that should guide technological development, focusing on the inclusion of communities currently marginalized by AI systems and favoring collaborative approaches by “shar[ing] design knowledge and tools”, in order to “work towards sustainable, community-led and -controlled outcomes” (Costanza-Chock 11-12). At the same time, it is exactly this discourse that big tech companies have appropriated: they, too, aim to ‘democratize’ AI – which entails distributing both its benefits and its tools to everyone.

In this research essay, I will first outline how big tech companies utilize the democratization discourse to their economic advantage, positioning their ML infrastructures in a way that serves their expansion. Secondly, against the background of many media researchers’ calls for ‘community-led practices’ around AI systems, I will critically investigate the US-American AI company Hugging Face, which advertises a “community-centric approach”. Echoing the discourse around community-led AI, the company sees itself “on a journey to advance and democratize artificial intelligence through open source and open science”, decidedly setting itself apart from big tech, which has not had “a track record of doing the right thing for the community” (Goldman). In this regard, I aim to analyze what their notion of ‘democratization’ entails, particularly against the background of Hugging Face recently announcing its cooperation with Amazon Web Services (AWS).

While access to AI infrastructures and community-led AI development are certainly important, I will lastly show how ML algorithms need to be considered beyond their instrumental notion. It is thus not enough to simply hand over the technology to the community – we need to think about how we can conceptualize a radically different approach to the creation of ML systems. This particularly entails questioning the deeply capitalist notions along which ML and its infrastructures are currently developed, and how we might break with these values that have been nourished for decades and that are deeply intertwined with ML research, development and education.

Tools and benefits “for everyone”: Big tech’s AI democratization

In the last decade, US-based tech companies primarily known for their social media platforms, search engines or online marketplaces have increasingly centered their endeavors around artificial intelligence. In 2017, for instance, CEO Sundar Pichai announced at the yearly Google I/O conference that the company would be focusing on an “AI first approach” (Google Developers). From then on, Google has been explicitly working on the integration of ML technologies into its products, such as its search engine, its YouTube recommendation algorithm or its file hosting service Google Drive. Around the same time, the research department Google AI was established – and other big tech companies set up, or further invested into, their own AI divisions (see, for instance, IBM, Microsoft, and Meta). Beyond integrating AI into their applications, these companies have moreover started to offer their AI technologies themselves as a product, moving into the heart of the AI industry (Srnicek, “Data, Compute, Labor” 242).

The corporate advances in the field of AI are accompanied by a discourse around ‘AI democratization’, which centers around the aim to make AI applications and infrastructures available to everyone. As Marcus Burkhardt details, this is targeted at users and developers:

“For developers this democratization entails the possibility to make use of AI in their own products and to partake in shaping the future of AI by having open or paid access to resources and services […]. Users on the other hand are enlisted in the democratization of AI as beneficiaries of technologies that are ‘infused’ with artificial intelligence and machine learning.” (211)

For the latter narrative, the companies closely link the advancement of AI with societal progress. Google AI’s mission, for instance, is to “create technologies that solve important problems and help people in their daily lives”, emphasizing the potential of AI to “empower people, widely benefit current and future generations, and work for the common good.” (Google AI, “Principles”) Microsoft underlines its aspiration to democratize AI “for every person and every organization”, grounded in the belief that the ‘essence’ of AI is “about helping everyone achieve more – humans and machines working together to make the world a better place.” (Microsoft News Center) And Meta states as its goal “to build AI responsibly, for everyone” and is “advancing AI for a more connected world.” (Meta AI)

The former narrative concerns the democratization of AI development, which corresponds to providing open access to the infrastructures necessary to do machine learning (such as data sets, cloud storage and computing resources, but also frameworks and libraries). Google offers a whole section on its AI website titled “Tools for everyone.” Here, the company claims: “We’re making tools and resources available so that anyone can use technology to solve problems. Whether you’re just getting started or you’re already an expert, find the resources you need to reach your next breakthrough.” (Google AI, “Tools”) This includes access to its open-source machine learning platform TensorFlow, as well as to Google datasets, pre-trained models and other training resources. Microsoft states: “At Microsoft, we have an approach […] that seeks to democratize Artificial Intelligence (AI), to take it from the ivory towers and make it accessible for all”, which includes the availability of its “intelligent capabilities […] to every application developer in the world.” (Microsoft News Center) And Amazon Web Services deploys its cloud as a means to “accelerat[e] the pace of innovation, democratiz[e] access to data, and allow[] researchers and scientists to scale, work collaboratively, and make new discoveries from which we may all benefit.” (Kratz)
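To give a concrete sense of what this kind of ‘access’ looks like in practice, consider the following minimal sketch in Python (assuming TensorFlow is installed; the choice of model is merely illustrative): a classifier pre-trained by Google on the ImageNet data set is downloaded and applied in a handful of lines, while the infrastructure, the data and the training work all remain on the company’s side.

    # A minimal sketch of "tools for everyone" in practice: a pre-trained
    # image classifier is fetched from Google's servers and used in a few lines.
    # Assumes TensorFlow is installed; the model choice is illustrative.
    import numpy as np
    import tensorflow as tf

    # Weights trained on ImageNet are downloaded on first use.
    model = tf.keras.applications.MobileNetV2(weights="imagenet")

    # A stand-in input image (1 x 224 x 224 x 3); random noise for the sketch.
    image = np.random.rand(1, 224, 224, 3).astype("float32") * 255
    image = tf.keras.applications.mobilenet_v2.preprocess_input(image)

    # Predict and map the 1000-dimensional output back to ImageNet labels.
    predictions = model.predict(image)
    print(tf.keras.applications.mobilenet_v2.decode_predictions(predictions, top=3))

What is notable is how little of the underlying process (data collection, labeling, training, infrastructure) is visible or negotiable at this level of use.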

Furthermore, the companies aim to lower the barrier to AI development tools “so that even non-experts inside and outside companies and universities can increasingly use the corresponding technologies.” (Sudmann 23) This entails, for instance, a variety of educational resources on offer, in the form of free ML introductory courses and training certificates which address not only experienced developers but also those who are looking for an entry point into ML development (Luchs, Apprich and Broersma). We can also notice a growing platformization of AI development tools, which leads to automated and standardized forms of ML development and is meant to enable anyone to develop ML systems without prerequisite knowledge (see, for instance, Google’s Vertex AI).
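The standardized form that such courses and tools converge on can be illustrated with a sketch like the following (modeled on the kind of introductory example common to ML tutorials; the data set and the tiny architecture are illustrative only): defining, compiling, training and evaluating a model collapses into a fixed template of a few calls, with most underlying decisions pre-made.

    # A sketch of the standardized define-compile-fit-evaluate template that
    # introductory courses and high-level tooling converge on. The data set
    # (MNIST) and the architecture are illustrative only.
    import tensorflow as tf

    (x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
    x_train, x_test = x_train / 255.0, x_test / 255.0  # normalize pixel values

    model = tf.keras.Sequential([
        tf.keras.Input(shape=(28, 28)),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(128, activation="relu"),
        tf.keras.layers.Dense(10, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])

    model.fit(x_train, y_train, epochs=1)   # training reduced to one call
    model.evaluate(x_test, y_test)          # evaluation reduced to another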

For both users of AI applications and their developers, the democratization of AI revolves primarily around the notion of access – and increasing the availability of AI technologies does indeed facilitate their democratization in some regard: it enables wide accessibility of ML infrastructures, and it expands – to some extent – the circle of those who can use and develop AI technologies in the first place. Nevertheless, this discourse must be viewed critically and as part of a larger historical trajectory that goes back to the beginnings of network technologies, their promises, but also their commodification. In this sense, the narrative that access to technologies empowers individuals, and that this, in turn, leads to more democratic societies, is by no means new. On the contrary, it has been an integral part of Silicon Valley’s “Californian Ideology” (Barbrook and Cameron) since its early beginnings – as Fred Turner, for instance, shows in his book From Counterculture to Cyberculture (2006), where he traces the origins of this digital utopianism.

One illustrative example is the virtual communities that began to appear in the 1980s with the advent of personal computers and bulletin board systems. For the first time, users could communicate across local barriers and in real time, which facilitated the forming of connections in new ways (Apprich, Technotopia 90). As a result, these virtual communities were seen as a glimpse of the promise to “dissolve social hierarchies and enable a self-government of emancipated citizens” (ibid. 91). It is this faith in the liberating power of network technologies – further manifested by the Internet emerging in the 1990s – that still shapes Silicon Valley today. Mark Zuckerberg, founder of Facebook, for instance, “envisions a world in which individuals, communities, and nations create an ideal social order through the constant exchange of information – that is, through staying ‘connected’” (Turner, “Machine Politics”). What is even more important, however, is that the companies of Silicon Valley see themselves as responsible for providing the necessary infrastructures. It is thus the same narrative – the view that new technologies are facilitators of social progress – which seamlessly fits into their capitalist aims and which “proved enormously profitable across Silicon Valley. By justifying the belief that for-profit systems are the best way to improve public life, it has helped turn the expression of individual experience into raw material that can be mined, processed, and sold.” (ibid.)

We can tell a similar story when it comes to software development. As Nathaniel Tkacz shows, two movements emerged in its early years, which displayed “two competing mutations of liberalism” (24): the Free Software Movement, initiated by Richard Stallman in the 1980s, which declared that all software should be ‘free’ in terms of usage, distribution and modification – and thus non-proprietary; and the Open Source Initiative, founded by Bruce Perens and Eric Raymond in 1998 (ibid. 21-23), which stood for a liberalism that facilitated economic growth and innovation. In his popular writing on The Cathedral and the Bazaar (2001), Raymond elaborates that the development of software should not be centrally controlled (as in his notion of the cathedral), but rather be as open as possible, allowing for a high degree of individual contributions (resembling a bazaar). At the same time, companies should be able to make use of the increased productivity by commodifying the results. Raymond’s bazaar thus centers around a market for “competing ‘agendas and ideas’; progress ‘at a speed barely imaginable’; and the miraculous emergence of a ‘coherent and stable system’” (Tkacz 24, citing Raymond).

It is this economic line of thought that also dominates the AI industry today. Big tech companies have an evident economic interest in expanding the reach of their AI technologies and infrastructures. Hence, what is advertised as democratization must above all be viewed as an expansion strategy, in which users are positioned as customers of corporate products. By making their infrastructures openly accessible, companies draw more developers to them, which entrenches these infrastructures in AI development generally. Further, by training new developers on their infrastructures, companies make them dependent on their products. And – as we can see – the open-source discourse serves as a means to drive ML research and to harness free contributions from the community, which, consequently, leads to further improvement of the corresponding AI technologies (Metz).

Advances under the frame of democratization can thus be understood as measures to ensure that company-owned products become “part of the general conditions of production”, serving as a “source of robust no-cost programming, a potential recruitment ground, and a strategic site for attracting users to their platforms.” (Dyer-Witheford, Kjøsen and Steinhoff 54) Or, as the authors state elsewhere: “If AI becomes generally available, it will still remain under the control of these capitalist providers.” (ibid. 56)

Against the background of this corporate dominance, Pieter Verdegem underlines the importance of current AI ethics debates as outlined in the introduction, but pleads particularly for a “radical democratization of AI” which not only entails accessibility to everyone, but takes the political and economic dimensions of the AI industry into account. Facing “a situation whereby only a few organisations, whether governmental or corporate, have the economic and political power to decide what type of AI will be developed and what purposes it will serve” (Verdegem, “Introduction” 12), Verdegem demands “a digital infrastructure that is available to and provides advantages for a broad range of stakeholders in society, not just the AI behemoths.” (Verdegem, “Dismantling AI Capitalism” 8)

In the following, I will thus shift attention to Hugging Face, an AI company that explicitly centers the ‘community’ in its endeavors, and analyze it against the background of these demands.

Community-centric AI: Hugging Face as an alternative to big tech?

Hugging Face is a New York-based AI company founded in 2016 by Clement Delangue, Julien Chaumond and Thomas Wolf. Originally, Hugging Face started out as a chatbot app for teenagers (Dillet). After positive responses to open-sourcing the models the chatbot was built on, the company moved to become a platform provider for open-source ML technologies (Osman and Sewell). Hugging Face is funded by 26 different investors and had raised $160.2 million in funding as of May 9, 2022 (Crunchbase). More than 5,000 organizations are using its models, including companies such as Meta AI, Google AI, Intel and Microsoft (Hugging Face, Official Website). The company was also listed in Forbes’ “AI 50” list in 2022, which “recognizes standouts in privately-held North American companies making the most interesting and effective use of artificial intelligence technology.” (Popkin, Ohnsman and Cai)

On its website, Hugging Face presents itself as “the AI community building the future.” (Hugging Face, Official Website) In an interview, founder Delangue elaborates:

“Just as science has always operated by making the field open and collaborative, we believe there’s a big risk of keeping machine learning power very concentrated in the hands of a few players, especially when these players haven’t had a track record of doing the right thing for the community. By building more openly and collaboratively within the ecosystem, we can make machine learning a positive technology for everyone and work on some short-term challenges that we are seeing.” (Goldman)

As we can see, Hugging Face follows narratives very similar to those advanced by big tech companies: first, the belief in social progress driven by AI, from which everyone should benefit, and second, the need for collaboration when it comes to the development of AI systems. However, the company explicitly demands countering the present concentration of power in the AI industry. At the center of its approach thus stands the desire to open-source models previously guarded by bigger players – particularly large language models, which are computationally intensive and not easily reproducible – in order to let everyone take part in the development of AI. As they state: “No single company, including the Tech Titans, will be able to ‘solve AI’ by themselves – the only way we’ll achieve this is by sharing knowledge and resources in a community-centric approach.” (Hugging Face, “Hugging Face Hub Documentation”)

In order to do so, Hugging Face offers an open-source library with “more than 100,000 machine learning models […], enabling others in turn to use those pretrained models for their own AI projects instead of having to build models from scratch.” (Popkin, Ohnsman and Cai) Moreover, Hugging Face is not only a model library, but – taking the developer platform GitHub as a role model – acts as a platform: on the ‘Hugging Face Hub’, developers can store code and training data sets, but also “easily collaborate and build ML together” (Hugging Face, “Hugging Face Hub Documentation”).
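What ‘using a pretrained model instead of building one from scratch’ looks like in practice can be sketched as follows (using the company’s transformers library; the model name is one of many sentiment classifiers hosted on the Hub and is chosen for illustration only):

    # A minimal sketch of reuse via the Hugging Face Hub: the transformers
    # library downloads a hosted model and wraps it in a ready-made pipeline.
    # The model name is illustrative.
    from transformers import pipeline

    classifier = pipeline(
        "sentiment-analysis",
        model="distilbert-base-uncased-finetuned-sst-2-english",
    )
    print(classifier("Open infrastructures do not guarantee open participation."))
    # Output: a label ("POSITIVE"/"NEGATIVE") together with a confidence score.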

Given its focus on community-centered approaches and its explicit stance against AI monopolization, the company seems to meet the demands outlined by media researchers above. However, with Hugging Face recently announcing its cooperation with Amazon Web Services (AWS), it seems that it, too, is deeply integrated into the economically driven ML ecosystem. Against the background of significant progress in the area of generative AI models (such as in text, audio or visual creation), which are generally proprietary and thus not publicly accessible, Hugging Face and AWS have declared a “long-term strategic partnership”, which is to “accelerate the availability of next-generation machine learning models by making them more accessible to the machine learning community and helping developers achieve the highest performance at the lowest cost.” (Boudier, Schmid and Simon) Specifically, this means that Hugging Face commits to AWS as its main cloud provider, so that Hugging Face users can move more easily between its platform and Amazon’s ML platform SageMaker, which is hosted on AWS and offers advanced cloud computing power (Bathgate). Vice versa, AWS customers will be provided with Hugging Face models on Amazon’s platform.
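A sense of how this tie-in surfaces on the level of code can be given by the following sketch, based on the documented pattern of Amazon’s SageMaker Python SDK for Hugging Face models; the script name, IAM role, instance type, version pins and S3 path are placeholders rather than working values.

    # Sketch of training a Hugging Face model via Amazon SageMaker: the work
    # runs on billed AWS instances against data stored in S3. Script name,
    # role, instance type, version pins and bucket path are placeholders.
    import sagemaker
    from sagemaker.huggingface import HuggingFace

    role = sagemaker.get_execution_role()  # requires an AWS/SageMaker environment

    estimator = HuggingFace(
        entry_point="train.py",            # user-supplied training script
        instance_type="ml.p3.2xlarge",     # AWS GPU instance, billed per hour
        instance_count=1,
        role=role,
        transformers_version="4.26",       # illustrative version pins
        pytorch_version="1.13",
        py_version="py39",
        hyperparameters={"model_name_or_path": "distilbert-base-uncased"},
    )
    estimator.fit({"train": "s3://example-bucket/train"})  # data must live in S3

However openly licensed the model itself may be, the computation, the billing and the data storage in such a setup run through Amazon’s infrastructure.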

Consequently, Hugging Face, too, while taking up the banner of democratization, principally acts within an economic context. A look at its business model provides further insight: while Hugging Face does offer its core technologies open-source and cost-free, several additional features come at a price and are organized around subscriptions and consumption-based plans (Osman and Sewell). Here, Hugging Face’s paying customers comprise mostly big corporations, “seeking expert support, additional security, autotrain features, private cloud, SaaS, and on-premise model hosting” (Osman and Sewell).

In this sense, it seems to become increasingly difficult not only to create alternative discourses around AI technologies, but also to provide sustainable alternatives that operate outside of big tech’s domain, given the challenge of reproducing the necessary infrastructures.

So what might AI democratization look like? Taking up a minor perspective

AI technologies and their platforms are not an isolated phenomenon, but can rather be regarded as another point in the genealogy of the commercialization of digital technologies by big tech companies – in this regard, their democratization, too, needs to be viewed critically.

As elaborated earlier, already with the emergence of net cultures in the 1990s, Silicon Valley forwarded a profound belief in the ability of technology to enhance collectivity and collaboration (Apprich, Technotopia 45). At the same time, however, an emerging net critique in Europe also believed in the potential of the new media technologies, but explicitly opposed the US-based Californian Ideology (ibid. 35). For its advocates, participation meant not solely contributing content to the emerging social networks, but being part of the growing project as a whole, “determining the directions, rules and enabling infrastructures of one’s own actions in a collective, participatory process.” (Stalder, “Partizipation” 221, own translation) It was then in the subsequent phase of commercialization and the emergence of Web 2.0 that those “core concepts of the first internet generation – communication, participation, openness to new things […] – [were made] suitable for the masses”, turning ‘participation’ into “user-generated content” (ibid. 223, own translation).

Consequently, while digital media technologies were becoming generally available, “the infrastructures behind these tools [got] increasingly concentrated in the hands of a few, private corporations.” (Apprich, Technotopia 146) And even though their platformization oftentimes simplified their use (van Dijck, The Culture of Connectivity 6), it was participation in their design that was closed off in favor of the streamlining and commercialization of user behavior.

With regard to the Californian Ideology, Clemens Apprich identifies the instrumentalization of technology as the core problematic, which hinders an escape from capitalist logic:

“The problem with this is that technology is not being recognised in its own logic, but rather seen as a means for something else – typically the liberation of the individual from the constraints of society. So, instead of acknowledging the socio-technical potential within it, technology is submitted to a communitarian thinking, which is predominantly defined by capitalist economy.” (Technotopia 144)

As we have seen, these dynamics are very similar to the discourse around AI democratization: both on the side of demands for community-led AI and on the side of big tech, we can recognize not only a wish to make AI accessible for all, but also the belief that “bringing the benefits of AI to everyone” (Google AI, Official Website) will lead to social progress. And particularly in the big tech discourse, this serves economic rationales. While generally a domain reserved for technical experts, machine learning is, under the frame of AI democratization, commodified into a form that is easily executable. However, it is not true participation – or democratization – that is enacted here. Rather, the notion of democratization is used as a front for the establishment of corporate products for AI development, and for the harnessing of free labor via the tasks developers perform on openly accessible corporate frameworks. Moreover, similar to how platform companies today dominate how we perform search, consume content online or interact with friends and family, AI technologies are gradually becoming platformized, with big tech companies such as Google, Microsoft and Amazon competing to become the monopoly provider. At the same time, demanding the integration of community-based organizations and counternarratives to these economic rationales proves increasingly difficult given the dominance big tech has already established in the AI industry – materially and discursively.

What we consequently need to do is go beyond the notion of ‘access’ as the sole condition for participation. At this point, we might again take as a model those 1990s net cultures that Apprich compellingly describes in his search for alternative imaginaries:

“In Europe, but also in the United States and elsewhere, non-commercial Internet Providers (e.g. Backspace, Centre for Culture & Communication, De Digitale Stad, Internationale Stadt, Ljudmila, Silver Server, Public Netbase, The Thing, XS4ALL) did not only offer Internet access, but also a platform for the self-determined use of new media technologies. The idea was to position net critique at the centre of action and to open up spaces of creation and experimentation […].” (Technotopia 37)

Related to the application and development of AI, its democratization should equally mean not only the general availability of the technology in the form of its material resources, but also a deeper understanding of and engagement with AI, which means challenging the existing power structures within the industry, but also confronting the inner logics of the technologies. Concepts such as scalability are deeply integrated into the practice of machine learning itself, which requires large amounts of data and high computational power; values such as universal applicability, efficiency, and simplicity dominate its everyday use (Luchs, Apprich and Broersma); and AI infrastructures are constructed as “uniform blocks ready for further expansion” (Tsing 505), as we can see from the companies’ attempts to attract users to their platforms and to expand the reach of their products (which are already extremely difficult to escape). We need to reflect on how we can conceptualize a radically different approach to the creation of ML systems which breaks with these capitalist values that have been nourished for decades and that are deeply intertwined with ML research, development and education – but also on how we can enable a relationship with AI technologies that does not consist of the mere execution of corporate products, but rather allows for true participation in their design.

One of the key beliefs of the proponents of big tech, as Joichi Ito states, is “that the world is ‘knowable’ and computationally simulatable, and that computers will be able to process the messiness of the real world just like they have every other problem that everyone said couldn’t be solved by computers.” (4) Instead, he posits, “[w]e need to embrace the unknowability – the irreducibility – of the real world […].” (ibid. 6) One way to conceive of an alternative perspective might thus be to follow a ‘nonscalability theory’ as an “alternative for conceptualizing the world” which “pays attention to the mounting pile of ruins that scalability leaves behind” (Tsing 507). For machine learning, this could mean acknowledging its limitations – concerning the messiness of reality and the impossibility of lossless translation, but also the messiness of the ML process itself, which has to deal with dirty data and the political notion of discrimination (Apprich, “Introduction”; Steyerl, “A Sea of Data”).

But also in practically engaging with the technology – in learning to do machine learning and in interacting with its platforms, libraries and datasets – we need to strive for critical practices. We should oppose big tech’s tendency to hide ML operations behind obfuscating interfaces of which we are merely the users, and look behind them in order to gain a deeper understanding of the technical operations and to acknowledge their embeddedness in our world. In fully understanding this condition, we sooner or later need to ask: is machine learning the best possible way to do data filtering and classification – or might we rather seek other technological means?
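To give a modest, concrete sense of what such ‘looking behind’ can mean, the following sketch unpacks the kind of one-line pipeline shown earlier into its intermediate steps (again using the transformers library together with PyTorch; the model name remains illustrative): the tokenization of text into numbers, the raw scores the model returns, and the normalization that turns them into the tidy percentages an interface displays.

    # Looking behind the one-line pipeline: the same classification task,
    # with the intermediate steps made visible. Model name is illustrative.
    import torch
    from transformers import AutoTokenizer, AutoModelForSequenceClassification

    name = "distilbert-base-uncased-finetuned-sst-2-english"
    tokenizer = AutoTokenizer.from_pretrained(name)
    model = AutoModelForSequenceClassification.from_pretrained(name)

    # Step 1: text becomes integer token ids, already a contingent encoding.
    inputs = tokenizer("Open infrastructures do not guarantee open participation.",
                       return_tensors="pt")
    print(inputs["input_ids"])

    # Step 2: the model returns raw, unnormalized scores (logits), not 'meaning'.
    with torch.no_grad():
        logits = model(**inputs).logits

    # Step 3: a softmax turns those scores into the tidy probabilities shown
    # by the interface; the apparent certainty is produced here.
    probabilities = torch.softmax(logits, dim=-1)
    print({model.config.id2label[i]: float(p) for i, p in enumerate(probabilities[0])})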

Acknowledgements

I would like to thank the participants of the APRJA workshop “Towards a Minor Tech”, the members of the Groningen “Data Infrastructures and Algorithmic Practices” research group and the anonymous peer reviewer for their extensive feedback on previous drafts as well as for the inspiring conversations.

Works cited

Apprich, Clemens. Technotopia. A Media Genealogy of Net Cultures. Rowman & Littlefield, 2017.

Apprich, Clemens. “Introduction.” Pattern Discrimination, edited by Clemens Apprich, Wendy Hui Kyong Chun, Florian Cramer and Hito Steyerl. meson press/Minnesota Press, 2018, pp. ix-xii.

Barbrook, Richard and Andy Cameron. “The Californian Ideology.” Mute vol. 1, no. 3, September 1995. https://www.metamute.org/editorial/articles/californian-ideology.

Bathgate, Rory. “AWS and Hugging Face Partner to ‘Democratise’ ML, AI Models.” ITPro, 22 February 2023. https://www.itpro.co.uk/cloud/370113/aws-and-hugging-face-partner-to-democratise-ml-ai-models.

Benjamin, Ruha. Race after Technology. Abolitionist Tools for the New Jim Code. Polity Press, 2019.

Boudier, Jeff, Schmid, Philipp and Julien Simon. “Hugging Face and AWS Partner to Make AI More Accessible.” Hugging Face Blog, 21 February 2023. https://huggingface.co/blog/aws-partnership.

Burkhardt, Marcus. “Mapping the Democratization of AI on GitHub. A First Approach.” The Democratization of Artificial Intelligence. Net Politics in the Era of Learning Algorithms, edited by Andreas Sudmann. transcript, 2019, pp. 209-221.

Costanza-Chock, Sasha. “Design Justice: Towards an Intersectional Feminist Framework for Design Theory and Practice.” Proceedings of the Design Research Society, 2018.

Crunchbase. “Hugging Face.” 2023. https://www.crunchbase.com/organization/hugging-face.

D’Ignazio, Catherine and Lauren F. Klein. Data Feminism. The MIT Press, 2020.

Dillet, Romain. “Hugging Face Wants to Become Your Artificial BFF.” TechCrunch, 9 March 2017. https://techcrunch.com/2017/03/09/hugging-face-wants-to-become-your-artificial-bff/.

Dyer-Witheford, Nick, Kjøsen, Atle Mikkola and James Steinhoff. Inhuman Power. Artificial Intelligence and the Future of Capitalism. Pluto Press, 2019.

Eubanks, Virginia. Automating Inequality. St. Martin’s Press, 2017.

Goldman, Sharon. “Fresh Off $2B Valuation, ML Platform Hugging Face Touts ‘Open and Collaborative Approach.’” Venture Beat, 9 May 2022. https://venturebeat.com/ai/fresh-off-2b-valuation-machine-learning-platform-hugging-face-highlights-open-and-collaborative-approach/.

Google AI. Official Website. 2023. https://ai.google/.

Google AI. “Principles.” Google AI Official Website, 2023. https://ai.google/principles.

Google AI. “Tools.” Official Website, 2023. https://ai.google/tools.

Google Developers. “Google I/O Keynote (Google I/O '17).” Uploaded on 17 May 2017. https://www.youtube.com/watch?v=Y2VF8tmLFHw.

Hugging Face. Official Website, 2023. https://huggingface.co.

Hugging Face. “Hugging Face Hub Documentation.” Official Website, 2023. https://huggingface.co/docs/hub/index.

Ito, Joichi. “Resisting Reduction: A Manifesto.” Journal of Design and Science, 2018.

Kratz, Jeff. “Accelerating and Democratizing Research with the AWS Cloud.” AWS Public Sector Blog, 14 September 2022. https://aws.amazon.com/blogs/publicsector/accelerating-democratizing-research-aws-cloud/.

Luchs, Inga, Apprich, Clemens and Marcel Broersma. “Learning Machine Learning. On the Political Economy of AI Online Courses.” Big Data & Society, vol. 10, no. 1, 2023.

Meta AI. “About.” Official Website, 2023. https://ai.facebook.com/about/.

Metz, Cade. “Google Just Open Sourced TensorFlow, Its Artificial Intelligence Engine.” WIRED, 9 November 2015. https://www.wired.com/2015/11/google-open-sources-its-artificial-intelligence-engine/.

Microsoft News Center. “Democratizing AI. For Every Person and Every Organization.” Microsoft Features, 26 September 2016. https://news.microsoft.com/features/democratizing-ai/.

O’Neil, Cathy. Weapons of Math Destruction. Penguin Random House, 2016.

Osman, Luqman and Dawson Sewell. “Hugging Face.” Contrary Research, 14 September 2022. https://research.contrary.com/reports/hugging-face.

Popkin, Helen A.S., Ohnsman, Alan and Kenrik Cai. “The AI 50.” Forbes, 9 May 2022. https://www.forbes.com/lists/ai50/?sh=73ac1959290f.

Raymond, Eric S. The Cathedral and the Bazaar. Musings on Linux and Open Source by an Accidental Revolutionary. O’Reilly, 2001.

Srnicek, Nick. “The Political Economy of Artificial Intelligence.” Recorded talk, Great Transformation, Jena, 23.09.-27.09.2019. https://www.youtube.com/watch?v=Fmi3fq3Q3Bo.

Srnicek, Nick. “Data, Compute, Labor.” Digital Work in the Planetary Market, edited by Mark Graham and Fabian Ferrari. The MIT Press, 2022, pp. 241-262.

Stalder, Felix. “Partizipation. Von der Teilhabe zum Spektakel und zurück.” Vergessene Zukunft. Radikale Netzkulturen in Europa, edited by Clemens Apprich and Felix Stalder. transcript, 2012, pp. 219-225.

Steyerl, Hito. “A Sea of Data: Pattern Recognition and Corporate Animism (Forked Version).” Pattern Discrimination, edited by Clemens Apprich, Wendy Hui Kyong Chun, Florian Cramer and Hito Steyerl. meson press/Minnesota Press, 2018, pp. 1-22.

Sudmann, Andreas. The Democratization of Artificial Intelligence. Net Politics in the Era of Learning Algorithms. transcript, 2019.

Tkacz, Nathaniel. Wikipedia and the Politics of Openness. The University of Chicago Press, 2015.

Tsing, Anna. “On Nonscalability: The Living World is Not Amenable to Precision-Nested Scales.” Common Knowledge, vol. 18, no. 3, 2012, pp. 505-524.

Turner, Fred. From Counterculture to Cyberculture. The University of Chicago Press, 2006.

Turner, Fred. “Machine Politics. The Rise of the Internet and a New Age of Authoritarianism.” Harper’s Magazine, 2019. https://harpers.org/archive/2019/01/machine-politics-facebook-political-polarization/.

van Dijck, José. The Culture of Connectivity. A Critical History of Social Media. Oxford University Press, 2013.

Verdegem, Pieter. “Introduction: Why We Need Critical Perspectives on AI.” AI for Everyone? Critical Perspectives, edited by Pieter Verdegem. University of Westminster Press, 2021, pp. 1-18.

Verdegem, Pieter. “Dismantling AI Capitalism: The Commons As an Alternative to the Power Concentration of Big Tech.” AI & Society, 9 April 2022.

West, Sarah Myers, Whittaker, Meredith and Kate Crawford. Discriminating Systems. Gender, Race, and Power in AI. AI Now Institute, 2019.