Grokking (machine learning)

In machine learning, grokking, or delayed generalization, is a phenomenon observed in some settings where a model abruptly transitions from overfitting (performing well only on training data) to generalizing (performing well on both training and test data), after many training iterations with little or no improvement on the held-out data. This contrasts with what is typically observed in machine learning, where generalization occurs gradually alongside improved performance on training data. == Origin == Grokking was introduced by OpenAI researcher Alethea Power and colleagues in the January 2022 paper "Grokking: Generalization Beyond Overfitting on Small Algorithmic Datasets". It is derived from the word grok coined by Robert Heinlein in his novel Stranger in a Strange Land. In ML research, "grokking" is not used as a synonym for "generalization"; rather, it names a sometimes-observed delayed‑generalization training phenomenon in which training and held‑out performance do not improve in tandem, and in which held‑out performance rises abruptly later. Authors also analyze the "grokking time", the epoch or step at which this transition occurs in those scenarios. == Interpretations == Grokking can be understood as a phase transition during the training process. In particular, recent work has shown that grokking may be due to a complexity phase transition in the model during training. While grokking has been thought of as largely a phenomenon of relatively shallow models, grokking has been observed in deep neural networks and non-neural models and is the subject of active research. One potential explanation is that the weight decay (a component of the loss function that penalizes higher values of the neural network parameters, also called regularization) slightly favors the general solution that involves lower weight values, but that is also harder to find. 
According to Neel Nanda, the process of learning the general solution may be gradual, even though the transition to the general solution occurs more suddenly later. Recent theories have hypothesized that grokking occurs when neural networks transition from a "lazy training" regime where the weights do not deviate far from initialization, to a "rich" regime where weights abruptly begin to move in task-relevant directions. Follow-up empirical and theoretical work has accumulated evidence in support of this perspective, and it offers a unifying view of earlier work as the transition from lazy to rich training dynamics is known to arise from properties of adaptive optimizers, weight decay, initial parameter weight norm, and more. This perspective is complementary to a unifying "pattern learning speeds" framework that links grokking and double descent; within this view, delayed generalization can arise across training time ("epoch‑wise") or across model size ("model‑wise"), and the authors report "model‑wise grokking".
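The "grokking time" mentioned above can be illustrated with a minimal sketch. It assumes accuracy is logged once per step and that "fitting" and "generalizing" are both defined as first crossing a fixed accuracy threshold; the function names, the 0.99 threshold, and the example curves are hypothetical choices for illustration, not definitions taken from the cited papers.

```python
from typing import Dict, Optional, Sequence


def first_crossing(values: Sequence[float], threshold: float) -> Optional[int]:
    """Return the first index at which values[i] >= threshold, or None."""
    for step, value in enumerate(values):
        if value >= threshold:
            return step
    return None


def grokking_delay(
    train_acc: Sequence[float],
    val_acc: Sequence[float],
    threshold: float = 0.99,  # assumed cut-off, not a standard value
) -> Dict[str, Optional[int]]:
    """Locate when training and held-out accuracy first reach the threshold.

    The gap between the two steps is one simple proxy for delayed
    generalization: a large positive gap means the model fit the training
    data long before it performed well on held-out data.
    """
    fit_step = first_crossing(train_acc, threshold)
    grok_step = first_crossing(val_acc, threshold)
    delay = None
    if fit_step is not None and grok_step is not None:
        delay = grok_step - fit_step
    return {"fit_step": fit_step, "grok_step": grok_step, "delay": delay}


if __name__ == "__main__":
    # Made-up curves: training accuracy saturates early, held-out accuracy
    # stays low for several steps and then rises abruptly.
    train = [0.20, 0.60, 0.99, 1.00, 1.00, 1.00, 1.00]
    val = [0.10, 0.10, 0.12, 0.15, 0.20, 0.995, 1.00]
    print(grokking_delay(train, val))
```

In practice the curves are noisy, so a smoothed curve or a requirement that accuracy stays above the threshold for several consecutive steps may be a more robust criterion than a single crossing.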

Personoid

Personoid is a concept coined by Stanisław Lem, a Polish science-fiction writer, in "Non Serviam", from his book A Perfect Vacuum (1971). His personoids are an abstraction of the functions of the human mind and live in computers; they do not need any human-like physical body. In cognitive and software modeling, personoid is a research approach to the development of intelligent autonomous agents. Within the IPK (Information, Preferences, Knowledge) architecture, it is a framework for an abstract intelligent agent with cognitive and structural intelligence. It can be seen as the essence of highly intelligent entities. From the philosophical and systemics perspectives, personoid societies can also be seen as the carriers of a culture. According to N. Gessler, the study of personoids can serve as a basis for research on artificial culture and cultural evolution.

== Personoids on TV and cinema ==

Welt am Draht (1973)
The Thirteenth Floor (1999)

Modulation error ratio

The modulation error ratio (MER) is a measure used to quantify the performance of a digital radio (or digital TV) transmitter or receiver in a communications system using digital modulation (such as QAM). A signal sent by an ideal transmitter or received by a receiver would have all constellation points precisely at the ideal locations; however, various imperfections in the implementation (such as noise, low image rejection ratio, phase noise, carrier suppression, distortion, etc.) or in the signal path cause the actual constellation points to deviate from the ideal locations. Transmitter MER can be measured by specialized equipment, which demodulates the received signal in a similar way to a real radio demodulator. The demodulated and detected signal can be used as a reasonably reliable estimate of the ideal transmitted signal in the MER calculation.

== Definition ==

An error vector is a vector in the I-Q plane between the ideal constellation point and the point received by the receiver. The Euclidean distance between the two points is its magnitude. The modulation error ratio is equal to the ratio of the root mean square (RMS) power (in watts) of the reference vector to the power (in watts) of the error. It is defined in dB as:

$$\mathrm{MER(dB)} = 10 \log_{10}\left(\frac{P_{\mathrm{signal}}}{P_{\mathrm{error}}}\right)$$

where $P_{\mathrm{error}}$ is the RMS power of the error vector, and $P_{\mathrm{signal}}$ is the RMS power of the ideal transmitted signal. MER is defined as a percentage in a compatible (but reciprocal) way:

$$\mathrm{MER(\%)} = \sqrt{\frac{P_{\mathrm{error}}}{P_{\mathrm{signal}}}} \times 100\%$$

with the same definitions. MER is closely related to error vector magnitude (EVM), but MER is calculated from the average power of the signal. MER is also closely related to signal-to-noise ratio. 
MER includes all imperfections including deterministic amplitude imbalance, quadrature error and distortion, while noise is random by nature.
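The two definitions above can be checked numerically with a short sketch. It assumes that constellation points are represented as complex numbers (I as the real part, Q as the imaginary part) and that the ideal symbol for each received sample is already known; the function name `modulation_error_ratio` and the QPSK example values are illustrative assumptions, not part of any measurement standard.

```python
import math
from typing import Sequence, Tuple


def modulation_error_ratio(
    ideal: Sequence[complex], received: Sequence[complex]
) -> Tuple[float, float]:
    """Return (MER in dB, MER in percent) for paired I-Q samples.

    P_signal is the mean squared magnitude of the ideal points and
    P_error is the mean squared magnitude of the error vectors.
    """
    if len(ideal) == 0 or len(ideal) != len(received):
        raise ValueError("ideal and received must be non-empty and equal length")

    n = len(ideal)
    p_signal = sum(abs(s) ** 2 for s in ideal) / n
    p_error = sum(abs(r - s) ** 2 for s, r in zip(ideal, received)) / n
    if p_signal == 0 or p_error == 0:
        raise ValueError("MER is undefined for zero signal or zero error power")

    mer_db = 10 * math.log10(p_signal / p_error)
    mer_percent = math.sqrt(p_error / p_signal) * 100
    return mer_db, mer_percent


if __name__ == "__main__":
    # Hypothetical QPSK constellation with a constant 0.1 offset on the I axis.
    ideal_points = [1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]
    received_points = [p + 0.1 for p in ideal_points]
    db, pct = modulation_error_ratio(ideal_points, received_points)
    print(f"MER = {db:.2f} dB ({pct:.2f} %)")
```

Because the percentage form uses the reciprocal power ratio under a square root, the two values are linked by MER(dB) = -20 log10(MER(%) / 100), so a higher dB figure always corresponds to a lower percentage.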

Digital redlining

Digital redlining is the practice of creating and perpetuating inequities between already marginalized groups specifically through the use of digital technologies, digital content, and the internet. The concept of digital redlining is an extension of the practice of redlining in housing discrimination, a historical legal practice in the United States and Canada dating back to the 1930s in which red lines were drawn on maps to indicate poor and primarily black neighborhoods that were deemed unsuitable for loans or further development, which created great economic disparities between neighborhoods. The term was popularized by Dr. Chris Gilliard, a privacy scholar, who defines digital redlining as "the creation and maintenance of tech practices, policies, pedagogies, and investment decisions that enforce class boundaries and discriminate against specific groups".

Though digital redlining is related to the digital divide and to techniques such as weblining and personalization, it is distinct from these concepts as part of larger complex systemic issues. It can refer to practices that create inequities of access to technology services in geographical areas, such as when internet service providers decide not to serve specific geographic areas because they are perceived to be less profitable, and thus reduce access to crucial services and civic participation. It can also refer to inequities caused by the policies and practices of digital technologies. For instance, inequities are created through divisions drawn by algorithms that are hidden from the technology user; the use of big data and analytics allows for a much more nuanced form of discrimination that can target specific vulnerable populations. 
These algorithmic means are enabled through the use of unregulated data technologies that assign individuals a score statistically categorizing personality traits or tendencies; such scores are similar to a credit score but are proprietary to the technology companies and not subject to outside oversight.

== Digital redlining and geography ==

While the roots of redlining lie in excluding populations based on geography, digital redlining occurs in both geographical and non-geographical contexts. An example of both contexts can be found in the charges brought against Facebook on March 28, 2019, by the United States Department of Housing and Urban Development (HUD). HUD charged Facebook with violating the Fair Housing Act of 1968 by "encouraging, enabling, and causing housing discrimination through the company's advertising platform." HUD stated that Facebook allowed advertisers to “exclude people who live in a specified area from seeing an ad by drawing a red line around that area.” The discrimination called out by HUD included practices that were racist, homophobic, ableist, and classist. Besides this example of geographically based digital redlining, HUD also charged that Facebook used profile information and designations to exclude classes of people. The charges stated: "Facebook enabled advertisers to exclude people whom Facebook classified as parents; non-American-born; non-Christian; interested in accessibility; interested in Hispanic culture; or a wide variety of other interests that closely align with the Fair Housing Act’s protected classes". Several media outlets pointed out HUD's own history of housing discrimination through redlining, the establishment of the Fair Housing Act to combat redlining, and how the digital platform was recreating this discriminatory practice. 
=== Digital redlining within a geographical context ===

Although digital redlining refers to a complex and varied set of practices, it has been most commonly applied to practices with a geographical dimension. Common examples include internet service providers deciding not to serve specific geographic areas because those areas are seen as less profitable, resulting in discrimination against low-income communities, with resulting impacts on access to crucial services and civic participation. AT&T has faced specific scrutiny for this form of digital redlining: it has been reported that AT&T has been classist in its offerings of broadband internet service in more impoverished areas. Geographically based digital redlining can also apply to digital content or the distribution of goods sold online. Geographically based games such as Pokémon Go have been shown to offer more virtual stops and rewards in geographic areas that are less ethnically and racially diverse. In 2016, Amazon was rebuked for not offering its Prime same-day delivery service to many communities that were largely African American and had incomes beneath the national average. Even services such as email can be affected, with many email administrators creating filters that flag particular email messages as spam based on the geographical origin of the message.

=== Digital redlining based on personal identity ===

Although often aligned with discrimination in a geographical context, digital redlining also refers to cases in which vulnerable populations are targeted for, or excluded from, specific content or access to the internet in a way that harms them based on some aspect of their identity. Trade schools and community colleges, which typically have a more working-class student body, have been found to block public internet content from their students where elite research institutions do not. 
The use of big data and analytics allows for a much more nuanced form of discrimination that can target specific vulnerable populations. For example, Facebook has been criticized for providing tools that allow advertisers to target ads by ethnic affinity and gender, effectively blocking minorities from seeing specific ads for housing and employment. In October 2019, a major class action lawsuit was filed against Facebook alleging gender and age discrimination in financial advertising. A broad array of consumers can be particularly vulnerable to digital redlining when it is used outside of a geographical context. Besides targeting vulnerable populations based on traditional and legally recognized classifications such as race, gender, and age, it has been shown that personal data mined and then resold by brokers can be used to target those who have been identified as suffering from Alzheimer's or dementia, or simply identified as impulse buyers or gullible.

== Term distinctions ==

=== Distinctions between weblining and digital redlining ===

Earlier distinctions have been made between weblining (the process of charging customers different prices based on profile information) and internet or digital redlining, with digital redlining being focused not on pricing but on access. As early as 2002, the Gale Encyclopedia of E-Commerce put forth the distinction more in use today: weblining is the pervasive and generally accepted (or at least tolerated) practice of personalizing access to products and services in ways invisible to the user; digital redlining is when such personalized, data-driven schemes perpetuate traditional advantages of privileged demographics. As weblining has become more ubiquitous, the term has fallen out of use in favor of the more general term personalization.

=== Distinctions between the digital divide and digital redlining ===

Scholars have often drawn connections between the digital divide and digital redlining. 
In practice, the digital divide is seen as one of a number of impacts of digital redlining, and digital redlining is one of a number of ways in which the divide is maintained or extended.

== Criticisms ==

A 2001 report examined whether the gap in access to broadband internet among low-income and minority populations was due to a lack of availability or to other factors. The report found that there was "little evidence of digital redlining based on income or black or Hispanic concentrations" but that there was mixed evidence of redlining based on areas in which Native American or Asian populations were larger.

Paperless society

A paperless society is a society in which paper communication (written documents, mail, letters, etc.) is replaced by electronic communication and storage. The concept was first introduced by Frederick Wilfrid Lancaster in 1978. Lancaster further predicted that libraries would no longer be needed to handle printed documents: "Librarians will, in time, become information specialists in a deinstitutionalized setting". Lancaster also stated that computers and libraries will not always give us the information that other people and life itself will.

== Literature ==

Brodman, E. (1979). Review of Toward Paperless Information Systems. Bulletin of the Medical Library Association, 67(4), 437–439.
Buckland, M. K. (1980). Review of Toward Paperless Information Systems. Journal of Academic Librarianship, 5(6), 349.
Grosch, A. (1979). Review of Toward Paperless Information Systems. College & Research Libraries, 40(1), 88–89.
Kohl, D. F. (2004). From the editor . . . The paperless society . . . Not quite yet. Journal of Academic Librarianship, 30(3), 177–178.
Lancaster, F. W. (1978a). Toward paperless information systems. New York: Academic Press.
Lancaster, F. W. (1980b). The future of the librarian lies outside of the library. Catholic Library World, 51, 388–391.
Lancaster, F. W. (1982a). Libraries and librarians in an age of electronics. Arlington, VA: Information Resources Press.
Lancaster, F. W. (1982b). The evolving paperless society and its implications for libraries. International Forum on Information and Documentation, 7(4), 3–10.
Lancaster, F. W. (1983). Future librarianship: Preparing for an unconventional career. Wilson Library Bulletin, 57, 747–753.
Lancaster, F. W. (1985). The paperless society revisited. American Libraries, 16, 553–555.
Lancaster, F. W. (1993). Libraries and the future: Essays on the library in the twenty-first century. New York: Haworth Press.
Lancaster, F. W. (1999). Second thoughts on the paperless society. Library Journal, 124(15), 48–50.
Lancaster, F. W., & Smith, L. C. (1980c). On-Line systems in the communication process: Projections. Journal of the American Society for Information Science, 31(3), 193–200.
Miall, D. S. (2001). The library versus the Internet: Literary studies under siege? Proceedings of the Modern Language Association, 116(5), 1405–1414.
Salton, G. (1979). Review of Toward Paperless Information Systems. Journal of Documentation, 35(3), 250–252.
Sellen, A. J., & Harper, R. H. R. (2003). The myth of the paperless office. Cambridge, MA: MIT Press.
Stevens, N. D. (2006). The fully electronic academic library. College & Research Libraries, 67(1), 5–14.
Young, A. P. (2008). Aftermath of a prediction: F. W. Lancaster and the paperless society. Library Trends, 56(4) (“The Evaluation and Transformation of Information Systems: Essays Honoring the Legacy of F. W. Lancaster,” edited by Lorraine J. Haricombe and Keith Russell), 843–858.

Gitter

Gitter is an open-source instant messaging and chat room system for developers and users of GitLab and GitHub repositories. Gitter is provided as software as a service, with a free option providing all basic features and the ability to create a single private chat room, and paid subscription options for individuals and organisations, which allow them to create arbitrary numbers of private chat rooms. Individual chat rooms can be created for individual Git repositories on GitHub. Chat room privacy follows the privacy settings of the associated GitHub repository: thus, a chat room for a private (i.e. members-only) GitHub repository is also private to those with access to the repository. A graphical badge linking to the chat room can then be placed in the Git repository's README file, bringing it to the attention of all users and developers of the project. Users can chat in the chat rooms, or access private chat rooms for repositories they have access to, by logging into Gitter via GitHub. Gitter is similar to Slack. Like Slack, it automatically logs all messages in the cloud. In late 2020, New Vector Limited acquired Gitter from GitLab and announced that Gitter's features would eventually be moved to New Vector's flagship product, Element, thereby replacing Gitter entirely. On February 13, 2023, Gitter migrated its service to a custom-branded Matrix instance that uses Element for its web interface. 
== Features prior to Migration to Matrix ==

Gitter supports:
Notifications, which are batched up on mobile devices to avoid annoyance
Inline media files
Viewing and subscribing to ("starring") multiple chat rooms in one web browser tab
Linking to individual files in the linked Git repository
Linking to GitHub issues (by typing # and then the issue number) in the linked Git repository, with hovercards showing the details of the issue
GitHub-flavored Markdown in chat messages
Online status for users
User hovercards, based on their GitHub profiles and statistics (number of GitHub followers, etc.)
Browsable and searchable message archives, grouped by month
Connection from IRC clients
Authentication using GitHub or Twitter in Gitter on iOS

=== Integrations with non-GitHub sites and applications ===

Gitter integrates with Trello, Jenkins, Travis CI, Drone (software), Heroku, and Bitbucket, among others.

=== Apps ===

Official Gitter apps for Windows, Mac, Linux, iOS and Android are available.

=== Account registration ===

Like other chat technologies, Gitter allows clients to instant message each other. It allows people to authenticate using a GitHub account and join a chat room from a web browser, so users do not need to install any software or create additional online accounts.

== History ==

Gitter was created by developers who were initially trying to build a generic web-based chat product, but then wrote extra code to hook their chat application up to GitHub to meet their own needs, and realised that they could turn the combined product into a viable specialist product in its own right. Gitter came out of beta in 2014. During the beta period, Gitter delivered 1.8 million chat messages. On March 15, 2017, GitLab announced the acquisition of Gitter. Included in the announcement was the stated intent that Gitter would continue as a standalone project. It was published as open source under an MIT License as of June 2017. 
On September 30, 2020, New Vector Limited acquired Gitter from GitLab, and announced upcoming support for the Matrix protocol in Gitter, which went live by the end of the year. Gitter's features would eventually be moved to New Vector's flagship product, Element, thereby replacing Gitter entirely. On February 13, 2023, Gitter migrated their service to a custom-branded Matrix instance that uses Element for its web interface. == Implementation prior to Migration to Matrix == The Gitter web application is implemented entirely in JavaScript, with the back end being implemented on Node.js. The source code to the web application was formerly proprietary (it was open-sourced in June 2017), although Gitter had made numerous auxiliary projects available as open-source software, such as an IRC bridge for IRC users who prefer using IRC client applications (and their extra features) to converse in the Gitter chat rooms.

Usage share of operating systems

The usage share of an operating system is the percentage of computers running that operating system (OS). These statistics are estimates, as wide-scale OS usage data is difficult to obtain and measure. Reliable primary sources are limited, and there is no formally agreed data collection methodology. Currently, web data collected from internet-connected devices provides an approximate measure of OS usage. As of December 2025, Android, which uses the Linux kernel, is the world's most popular operating system with 38.94% of the global market, followed by Windows with 29.99%, iOS with 15.66%, macOS with 2.14%, and other operating systems with 10.78%. These figures cover all device types except embedded devices. For smartphones and other mobile devices, Android has 72% market share, and Apple's iOS has 28%. For desktop computers and laptops, as of March 2026, Microsoft Windows has 60.8%, followed by unknown operating systems at 19.7%, macOS at 14.4%, desktop Linux at 3.2%, and Google's ChromeOS at 1.6%. For tablets, Apple's iPadOS (a variant of iOS) has 52% share and Android has 48% worldwide. For the top 500 most powerful supercomputers, Linux distributions have had 100% of the market share since 2017. In the global server operating system market, Linux leads with a 63.1% market share, followed by Windows, Unix and other operating systems. Linux is also the most used OS for web servers, and the most common Linux distribution is Ubuntu, followed by Debian. In some regions, such as South America, Linux has almost caught up with macOS, the second-most popular desktop OS; in Asia it is at 6.4% (7% with ChromeOS) versus 9.7% for macOS. In the US, ChromeOS is third at 5.5%, followed by (desktop) Linux at 4.3%. Embedded systems are the most numerous type of device with an operating system. 
Not all embedded systems have operating systems, instead running their application code on the "bare metal"; of those that do have operating systems, a high percentage are standalone or do not have a web browser, which makes their usage share difficult to measure. Some operating systems used in embedded systems are more widely used than some of those mentioned above; for example, modern Intel microprocessors contain an embedded management processor running a version of the Minix operating system. == Worldwide device shipments == Shipments (to stores) do not necessarily translate to sales to consumers, therefore suggesting the numbers indicate popularity and/or usage could be misleading. Not only do smartphones sell in higher numbers than PCs, but also a lot more by dollar value, with the gap only projected to widen, to well over double. According to Gartner, the following is the worldwide device shipments (referring to wholesale) by operating system from 2012 to 2016, which includes smartphones, tablets, laptops and PCs together. On 27 January 2016, Paul Thurrott summarized the operating system market, the day after Apple announced "one billion devices": Apple's "active installed base" is now one billion devices. [..] Granted, some of those Apple devices were probably sold into the marketplace years ago. But that 1 billion figure can and should be compared to the numbers Microsoft touts for Windows 10 (200 million, most recently) or Windows more generally (1.5 billion active users, a number that hasn’t moved, magically, in years), and that Google touts for Android (over 1.4 billion, as of September). My understanding of iOS is that the user base was previously thought to be around 800 million strong, and when you factor out Macs and other non-iOS Apple devices, that's probably about right. But as you can see, there are three big personal computing platforms. And only one of them is actually declining. 
We’ll see how Windows 10 fares over the long term, but even if Microsoft hits the 1 billion figure in 1-2 years as promised, it will by then still be the smallest of those three platforms.

In 2018, Apple stopped revealing unit sales in its reports. Since then, the company has published only revenue per device model, which has nonetheless allowed analysts to extrapolate unit sales from the model revenues by applying wholesale device prices. Other hardware manufacturers usually do not report unit sales.

=== PC shipments ===

For 2015 (and earlier), Gartner reports for "the year, worldwide PC shipments declined for the fourth consecutive year, which started in 2012 with the launch of tablets", with an 8% decline in PC sales for 2015 (not including the cumulative decline in sales over the previous years). Microsoft backed away from its goal of one billion Windows 10 devices in three years (or "by the middle of 2018") and reported on 26 September 2016 that Windows 10 was running on over 400 million devices, and in March 2019, on more than 800 million. In May 2020, Gartner predicted further decline in all market segments for 2020 due to COVID-19, predicting a decline of 13.6% for all devices, while the "Work from Home Trend Saved PC Market from Collapse", with a decline of only 10.5% predicted for PCs. However, in the end, according to Gartner, PC shipments grew 10.7% in the fourth quarter of 2020 and reached 275 million units in 2020, a 4.8% increase from 2019 and the highest growth in ten years. Apple, in fourth place for PCs, had the largest growth in shipments of any company in Q4, at 31.3%, while "the fourth quarter of 2020 was another remarkable period of growth for Chromebooks, with shipments increasing around 200% year over year to reach 11.7 million units. In 2020, Chromebook shipments increased over 80% to total nearly 30 million units, largely due to demand from the North American education market." 
Chromebooks outsold Apple's Macs worldwide (30 million versus 22.5 million) in the pandemic year 2020. According to the Catalyst group, the year 2021 had record-high PC shipments, with total shipments of 341 million units (including Chromebooks), 15% higher than 2020 and 27% higher than 2019, the largest shipment total since 2012. According to Gartner, worldwide PC shipments declined by 16.2% in 2022, the largest annual decrease since the mid-1990s, due to geopolitical, economic, and supply chain challenges. In 2024 and 2025, due to lower adoption of Windows 11 and Microsoft ending its support for Windows 10, the number of PCs shipped with a pre-installed Windows OS dropped. Pundits attribute the low Windows 11 acceptance to its steep hardware requirements, especially the requirement for a TPM 2.0-ready chipset, and to the 2024 CrowdStrike-related IT outages. Meanwhile, the macOS share of PC device shipments increased to new heights, with improved numbers seen for Linux devices too. In Q3 2025, shipments of devices with macOS pre-installed increased by 14.9% year-over-year (YoY), while overall PC shipments increased by only 8.1%; in Q2 2025, they grew 21.4% YoY while global PC shipments increased by only 6.5%; and in Q1 2025, they grew 7% YoY while global PC shipments increased by 4.8%.

=== Tablet computers shipments ===

In 2015, eMarketer estimated at the beginning of the year that the tablet installed base would hit one billion for the first time (with China's use at 328 million, which Google Play doesn't serve or track, and the United States's use second at 156 million). At the end of the year, because of cheap tablets (not counted by all analysts), that goal was met (even excluding cumulative sales of previous years) as: Sales quintupled to an expected 1 billion units worldwide this year, from 216 million units in 2014, according to projections from the Envisioneering Group. 
While that number is far higher than the 200-plus million units globally projected by research firms IDC, Gartner and Forrester, Envisioneering analyst Richard Doherty says the rival estimates miss all the cheap Asian knockoff tablets that have been churning off assembly lines.[..] Forrester says its definition of tablets "is relatively narrow" while IDC says it includes some tablets by Amazon — but not all.[..] The top tech purchase of the year continued to be the smartphone, with an expected 1.5 billion sold worldwide, according to projections from researcher IDC. Last year saw some 1.2 billion sold.[..] Computers didn’t fare as well, despite the introduction of Microsoft's latest software upgrade, Windows 10, and the expected but not realized bump it would provide for consumers looking to skip the upgrade and just get a new computer instead. Some 281 million PCs were expected to be sold, according to IDC, down from 308 million in 2014. Folks tend to be happy with the older computers and keep them for longer, as more of our daily computing activities have moved to the smartphone.[..] While Windows 10 got good reviews from tech critics, only 11% of the 1-billion-plus Windows user base opted to do the upgrade, according to Microsoft. This suggests Microsoft has a ways to go before the software gets "hit" status. Apple's new operating system El Capitan has been