
OpenAI Publications and Papers

We looked at two leading databases for academic research and publications, Scopus and Google Scholar, to curate a complete list of OpenAI publications and papers. The data is organized in an Airtable for convenient analysis.

This list will be kept up to date as an easy-to-reference location for OpenAI-affiliated publications.

To learn more about OpenAI, read our OpenAI Partnerships List and OpenAI Patent List.

Overview of Airtable Columns 

The Airtable includes the following columns, each representing critical information about the publications:

  • Year: The year the publication was released.
  • Source Title: The title of the journal or conference where the publication appeared.
  • Title: The title of the publication.
  • Authors: The authors who contributed to the publication.
  • Page Count: The number of pages in the publication.
  • DOI: The Digital Object Identifier, a unique identifier for the publication.
  • Link: A link to the publication (Scopus and/or Google Scholar).
  • Affiliations: The affiliations of the authors.
  • Abstract: A brief summary of the publication.
  • Publisher: The entity that published the journal or conference proceedings.
  • Document Type: The type of document (e.g., article, conference paper).
  • PubMed ID: The PubMed identifier, if applicable.
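
As a rough illustration of the column layout above, one row of the table could be modeled as a small Python record. This is only a sketch: the class name `PublicationRecord`, the field names, and the choice of types are assumptions made for illustration, not the actual Airtable schema. The example values are taken from the Games and Economic Behavior entry listed later in this article, with the abstract and link left out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PublicationRecord:
    """Hypothetical model of one table row; field names are illustrative."""
    year: int
    source_title: str
    title: str
    authors: list[str]
    page_count: Optional[int]      # None when the page count is not available
    doi: Optional[str]
    link: Optional[str]            # Scopus and/or Google Scholar URL
    affiliations: list[str]
    abstract: str
    publisher: str
    document_type: str             # e.g. "Article", "Conference paper"
    pubmed_id: Optional[str] = None  # only present for some publications


example = PublicationRecord(
    year=2024,
    source_title="Games and Economic Behavior",
    title="Beyond dominance and Nash: Ranking equilibria by critical mass",
    authors=["Kalai A.T.", "Kalai E."],
    page_count=16,
    doi="10.1016/j.geb.2024.01.011",
    link=None,  # URL omitted in this sketch
    affiliations=["OpenAI, 3180 18th Street, San Francisco, 94110, United States"],
    abstract="(abstract omitted in this sketch)",
    publisher="Academic Press Inc.",
    document_type="Article",
)

print(example.title, example.year, example.page_count)
```

Using `None` for missing values keeps "not available" distinct from a real value such as a page count of 0.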

Findings of Note

Number of publications by document type 

Out of the 164 papers found in the Scopus database:

  • 122 (74.4%) were conference papers
  • 34 (20.7%) were articles
  • The remaining 8 (4.9%) were spread across reviews, editorials, book chapters, and other document types
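
The shares above follow directly from the counts. A minimal sketch of the arithmetic, where the "Other" count is derived by subtraction from the 164 total rather than reported separately in the data:

```python
total = 164
counts = {"Conference paper": 122, "Article": 34}

# Everything that is neither a conference paper nor an article.
counts["Other"] = total - sum(counts.values())

# Percentage share of each document type, rounded to one decimal place.
shares = {doc_type: round(100 * n / total, 1) for doc_type, n in counts.items()}

print(counts["Other"])  # 8
print(shares)           # {'Conference paper': 74.4, 'Article': 20.7, 'Other': 4.9}
```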

Number of publications by subject area

Out of the 164 papers found in the Scopus database:

  • Computer Science dominated with 45.5% of the publications.
  • Engineering followed with 13.0%.
  • Social Sciences contributed 11.4% of the papers.
  • Arts and Humanities accounted for 9.7%.
  • Mathematics made up 7.7%.
  • Multidisciplinary fields had 2.0% of the publications.
  • Agricultural and Biological Sciences; Biochemistry, Genetics and Molecular Biology; and Physics and Astronomy each had 1.7%.

The remaining publications were grouped under "Other" and constituted 5.7% of the total publications.

This distribution highlights the focus on Computer Science and Engineering, which together make up the majority of OpenAI's research outputs.
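
As a quick sanity check on the figures above, the listed shares can be summed. This sketch assumes the 1.7% line refers to three subject areas (Agricultural and Biological Sciences; Biochemistry, Genetics and Molecular Biology; Physics and Astronomy). Under that reading the shares add up to roughly 100%, with the small excess attributable to rounding.

```python
subject_shares = {
    "Computer Science": 45.5,
    "Engineering": 13.0,
    "Social Sciences": 11.4,
    "Arts and Humanities": 9.7,
    "Mathematics": 7.7,
    "Multidisciplinary": 2.0,
    "Agricultural and Biological Sciences": 1.7,
    "Biochemistry, Genetics and Molecular Biology": 1.7,
    "Physics and Astronomy": 1.7,
    "Other": 5.7,
}

# Sum of all listed shares; rounding avoids floating-point noise.
total_share = round(sum(subject_shares.values()), 1)

# Combined share of the two largest subject areas.
cs_and_engineering = round(
    subject_shares["Computer Science"] + subject_shares["Engineering"], 1
)

print(total_share)         # 100.1
print(cs_and_engineering)  # 58.5
```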

Number of publications by country 

Out of the 164 papers found in the Scopus database:

  • The United States came out on top, appearing on 162 of the papers.
  • The United Kingdom followed with 18 papers.
  • Canada was next with 16 papers.
  • Germany had 8 papers, while Switzerland had 7 papers.
  • Belgium contributed 6 papers.
  • Japan and the Netherlands each had 5 papers.
  • Poland and France both had 4 papers.

The distribution shows that the majority of publications are concentrated in a few key countries, with the United States leading by a significant margin.
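
Note that the country counts above add up to more than 164. A likely explanation, and it is an assumption here rather than something stated in the data, is that a paper with authors in several countries is counted once for each country. The sketch below shows the arithmetic:

```python
total_papers = 164
country_counts = {
    "United States": 162,
    "United Kingdom": 18,
    "Canada": 16,
    "Germany": 8,
    "Switzerland": 7,
    "Belgium": 6,
    "Japan": 5,
    "Netherlands": 5,
    "Poland": 4,
    "France": 4,
}

# Sum over the ten listed countries; exceeds the number of papers,
# which suggests multi-country papers are counted more than once.
listed_total = sum(country_counts.values())

# Share of all papers with at least one US-affiliated author.
us_share = round(100 * country_counts["United States"] / total_papers, 1)

print(listed_total)  # 235
print(us_share)      # 98.8
```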

2024 OpenAI Publications — A Brief Overview

1. A Qubit, a Coin, and an Advice String Walk Into a Relational Problem

  • Year: 2024
  • Source title: Leibniz International Proceedings in Informatics, LIPIcs
  • Page count: N/A
  • DOI: 10.4230/LIPIcs.ITCS.2024.1
  • Link: Link
  • Affiliations: University of Texas at Austin, TX, United States
  • Publisher: Schloss Dagstuhl - Leibniz-Zentrum für Informatik GmbH, Dagstuhl Publishing, Germany
  • Authors: Aaronson S.; Buhrman H.; Kretschmer W.
  • Document Type: Conference paper
  • PubMed ID: N/A
  • Abstract: Relational problems (those with many possible valid outputs) are different from decision problems, but it is easy to forget just how different. This paper initiates the study of FBQP/qpoly, the class of relational problems solvable in quantum polynomial time with the help of polynomial-sized quantum advice, along with its analogues for deterministic and randomized computation (FP, FBPP) and advice (/poly, /rpoly). Our first result is that FBQP/qpoly ≠ FBQP/poly, unconditionally, with no oracle - a striking contrast with what we know about the analogous decision classes. The proof repurposes the separation between quantum and classical one-way communication complexities due to Bar-Yossef, Jayram, and Kerenidis. We discuss how this separation raises the prospect of near-term experiments to demonstrate "quantum information supremacy," a form of quantum supremacy that would not depend on unproved complexity assumptions. Our second result is that FBPP ⊄ FP/poly - that is, Adleman's Theorem fails for relational problems - unless PSPACE ⊂ NP/poly. Our proof uses IP = PSPACE and time-bounded Kolmogorov complexity. On the other hand, we show that proving FBPP ⊄ FP/poly will be hard, as it implies a superpolynomial circuit lower bound for PromiseBPEXP. We prove the following further results: Unconditionally, FP ≠ FBPP and FP/poly ≠ FBPP/poly (even when these classes are carefully defined). FBPP/poly = FBPP/rpoly (and likewise for FBQP). For sampling problems, by contrast, SampBPP/poly ≠ SampBPP/rpoly (and likewise for SampBQP).

2. Demonstrating a Long-Coherence Dual-Rail Erasure Qubit Using Tunable Transmons

  • Year: 2024
  • Source title: Physical Review X
  • Page count: N/A
  • DOI: 10.1103/PhysRevX.14.011051
  • Link: Link
  • Affiliations: AWS Center for Quantum Computing, Pasadena, 91125, CA, United States
  • Publisher: American Physical Society
  • Authors: Levine H.; Haim A.; Hung J.S.C.; Alidoust N.; Markovitch I.; O’Brien T.E.; Vool U.
  • Document Type: Article
  • PubMed ID: N/A
  • Abstract: Quantum error correction with erasure qubits promises significant advantages over standard error correction due to favorable thresholds for erasure errors. To realize this advantage in practice requires a qubit for which nearly all errors are such erasure errors, and the ability to check for erasure errors without dephasing the qubit. We demonstrate that a "dual-rail qubit" consisting of a pair of resonantly coupled transmons can form a highly coherent erasure qubit, where transmon T1 errors are converted into erasure errors and residual dephasing is strongly suppressed, leading to millisecond-scale coherence within the qubit subspace. We show that single-qubit gates are limited primarily by erasure errors, with erasure probability p_erasure = 2.19(2)×10⁻³ per gate while the residual errors are ∼40 times lower. We further demonstrate midcircuit detection of erasure errors while introducing <0.1% dephasing error per check. Finally, we show that the suppression of transmon noise allows this dual-rail qubit to preserve high coherence over a broad tunable operating range, offering an improved capacity to avoid frequency collisions. This work establishes transmon-based dual-rail qubits as an attractive building block for hardware-efficient quantum error correction.

3. Correction to “Compressed sensing in the presence of speckle noise”

  • Year: 2024
  • Source title: IEEE Transactions on Information Theory
  • Page count: N/A
  • DOI: 10.1109/TIT.2024.3409274
  • Link: Link
  • Affiliations: OpenAI, San Francisco, CA, USA; Department of Electrical Engineering, Stanford University, Stanford, CA, USA
  • Publisher: Institute of Electrical and Electronics Engineers
  • Authors: Zhou W.; Jalali S.; Maleki A.
  • Document Type: Article
  • PubMed ID: N/A
  • Abstract: This paper presents a correction to Theorem 2 in [1] which follows from fixing an error in Lemma 5 and a minor correction in the constant of Lemma 3. Despite modifications to upper bounds and constants, the core conclusions of the original paper remain unaffected. The revised proofs now feature precise constants for clarity, maintaining the original findings’ integrity.

4. AI is a viable alternative to high throughput screening: a 318-target study

  • Year: 2024
  • Source title: Scientific Reports
  • Page count: N/A
  • DOI: 10.1038/s41598-024-54655-z
  • Link: Link
  • Affiliations: Atomwise Inc., San Francisco, United States; Amazon Web Services, USA
  • Publisher: Nature Research
  • Authors: Wallach I.; Bernard D.; Nguyen K.; Ho G.; Morris Q.
  • Document Type: Article
  • PubMed ID: N/A
  • Abstract: High throughput screening (HTS) is routinely used to identify bioactive small molecules. This requires physical compounds, which limits coverage of accessible chemical space. Computational approaches combined with vast on-demand chemical libraries can access far greater chemical space, provided that the predictive accuracy is sufficient to identify useful molecules. Through the largest and most diverse virtual HTS campaign reported to date, comprising 318 individual projects, we demonstrate that our AtomNet® convolutional neural network successfully finds novel hits across every major therapeutic area and protein class. We address historical limitations of computational screening by demonstrating success for target proteins without known binders, high-quality X-ray crystal structures, or manual cherry-picking of compounds. We show that the molecules selected by the AtomNet® model are novel drug-like scaffolds rather than minor modifications to known bioactive compounds. Our empirical results suggest that computational methods can substantially replace HTS as the first step of small-molecule drug discovery.

5. Beyond dominance and Nash: Ranking equilibria by critical mass

  • Year: 2024
  • Source title: Games and Economic Behavior
  • Page count: 16
  • DOI: 10.1016/j.geb.2024.01.011
  • Link: Link
  • Affiliations: OpenAI, 3180 18th Street, San Francisco, 94110, United States
  • Publisher: Academic Press Inc.
  • Authors: Kalai A.T.; Kalai E.
  • Document Type: Article
  • PubMed ID: N/A
  • Abstract: Strategic interactions pose central issues that are not adequately explained by the traditional concepts of dominant strategy equilibrium (DSE), Nash equilibrium (NE), and their refinements. A comprehensive analysis of equilibrium concepts within the von Neumann-Nash framework of n-person optimization reveals a decreasing hierarchy of n nested concepts ranging from DSE to NE. These concepts are defined by the “critical mass,” the number of players needed to adopt and sustain the play of a strategy profile as an equilibrium. In games with n>2 players, the n−2 intermediate concepts explain strategic issues in large social systems, implementation, decentralization, as well as replication studied in economics, operations management, and political games. 

6. Efficient Reinforcement Learning with Impaired Observability: Learning to Act with Delayed and Missing State Observations

  • Year: 2024
  • Source title: IEEE Transactions on Information Theory
  • Page count: N/A
  • DOI: 10.1109/TIT.2024.3416202
  • Link: Link
  • Affiliations: Princeton University, United States; Stanford University, United States
  • Publisher: Institute of Electrical and Electronics Engineers
  • Authors: Chen M.; Meng J.; Bai Y.; Ye Y.; Vincent Poor H.
  • Document Type: Article
  • PubMed ID: N/A
  • Abstract: In real-world reinforcement learning (RL) systems, various forms of impaired observability can complicate matters. These situations arise when an agent is unable to observe the most recent state of the system due to latency or lossy channels, yet the agent must still make real-time decisions. This paper introduces a theoretical investigation into efficient RL in control systems where agents must act with delayed and missing state observations. We present algorithms and establish near-optimal regret upper and lower bounds, of the form Õ(√(poly(H)·SAK)), for RL in the delayed and missing observation settings. Here S and A are the sizes of state and action spaces, H is the time horizon and K is the number of episodes. Despite impaired observability posing significant challenges to the policy class and planning, our results demonstrate that learning remains efficient, with the regret bound optimally depending on the state-action size of the original system. Additionally, we provide a characterization of the performance of the optimal policy under impaired observability, comparing it to the optimal value obtained with full observability. Numerical results are provided to support our theory.

7. PyTorch 2: Faster Machine Learning Through Dynamic Python Bytecode Transformation and Graph Compilation

  • Year: 2024
  • Source title: International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS)
  • Page count: 18
  • DOI: 10.1145/3620665.3640366
  • Link: Link
  • Affiliations: Meta; OpenAI; Quansight; Intel; University of California, Berkeley
  • Publisher: Association for Computing Machinery
  • Authors: Ansel J.; Yang E.; He H.; Gimelshein N.; Jain R.
  • Document Type: Conference paper
  • PubMed ID: N/A
  • Abstract: This paper introduces two extensions to the popular PyTorch machine learning framework, TorchDynamo and TorchInductor, which implement the torch.compile feature released in PyTorch 2. TorchDynamo is a Python-level just-in-time (JIT) compiler that enables graph compilation in PyTorch programs without sacrificing the flexibility of Python. It achieves this by dynamically modifying Python bytecode before execution and extracting sequences of PyTorch operations into an FX graph, which is then JIT compiled using one of many extensible backends. TorchInductor is the default compiler backend for TorchDynamo, which translates PyTorch programs into OpenAI's Triton for GPUs and C++ for CPUs. Results show that TorchDynamo is able to capture graphs more robustly than prior approaches while adding minimal overhead, and TorchInductor is able to provide a 2.27× inference and 1.41× training geometric mean speedup on an NVIDIA A100 GPU across 180+ real-world models, which outperforms six other compilers. These extensions provide a new way to apply optimizations through compilers in eager mode frameworks like PyTorch.

Sources and Methodology

The data for this article was primarily sourced from Scopus and Google Scholar, two leading databases for academic publications. We cross-referenced these sources to compile a comprehensive list of OpenAI's publications. 

  • Scopus: A comprehensive abstract and citation database for peer-reviewed literature.
  • Google Scholar: A freely accessible web search engine that indexes the full text or metadata of scholarly literature.

Then, the data was organized in an Airtable to facilitate analysis and visualization.
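
The exact cross-referencing procedure is not described here, so the following is only one plausible way to do it: merge the two exports on a normalized DOI, falling back to a lowercased title when a record has no DOI. The function names `normalize_doi` and `merge_records` and the dictionary keys are hypothetical, and the Google Scholar row is an invented formatting variant of a paper from the list above.

```python
def normalize_doi(doi):
    """Lowercase a DOI and strip common URL/scheme prefixes."""
    if not doi:
        return None
    doi = doi.strip().lower()
    for prefix in ("https://doi.org/", "http://doi.org/", "doi:"):
        if doi.startswith(prefix):
            doi = doi[len(prefix):]
    return doi or None


def merge_records(scopus, scholar):
    """Combine two record lists, keeping one entry per publication."""
    merged = {}
    for source_name, records in (("Scopus", scopus), ("Google Scholar", scholar)):
        for rec in records:
            doi = normalize_doi(rec.get("doi"))
            # Fall back to the title when no DOI is present. A paper that has
            # a DOI in one source but not the other would NOT be matched here.
            key = doi or rec["title"].strip().lower()
            entry = merged.setdefault(
                key, {"title": rec["title"], "doi": doi, "sources": []}
            )
            if source_name not in entry["sources"]:
                entry["sources"].append(source_name)
    return list(merged.values())


scopus_rows = [
    {"title": "Beyond dominance and Nash: Ranking equilibria by critical mass",
     "doi": "10.1016/j.geb.2024.01.011"},
    {"title": "AI is a viable alternative to high throughput screening: a 318-target study",
     "doi": "10.1038/s41598-024-54655-z"},
]
scholar_rows = [
    # Illustrative variant: same paper, different capitalization and DOI form.
    {"title": "Beyond Dominance and Nash: Ranking Equilibria by Critical Mass",
     "doi": "https://doi.org/10.1016/J.GEB.2024.01.011"},
]

combined = merge_records(scopus_rows, scholar_rows)
for row in combined:
    print(row["doi"], row["sources"])
```

Matching on DOI first avoids false duplicates caused by small differences in how each database formats titles.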

Jonathan Gillham

Founder / CEO of Originality.ai. I have been involved in the SEO and content marketing world for over a decade. My career started with a portfolio of content sites; more recently I sold two content marketing agencies, and I am the co-founder of MotionInvest.com, the leading place to buy and sell content websites. Through these experiences I understand what web publishers need when it comes to verifying that content is original. I am not for or against AI content; I think it has a place in everyone's content strategy. However, I believe you as the publisher should be the one deciding when to use AI content. Our Originality checking tool has been built with serious web publishers in mind!
