Comments on Common AI Questions

The field of artificial intelligence raises numerous questions that are frequently discussed but often left without a clear consensus. We would like to contribute our own perspective on some of these recurring topics. In formulating this text, we used GPT-4 to assist with language generation, but the insights and conclusions drawn are entirely our own.

The questions we address are:

  • Can Machines Develop Consciousness?
  • Should Humans Verify Statements from Large Language Models (LLMs)?
  • Can Large Language Models (LLMs) Generate New Knowledge?

From the philosophical implications to practical applications, these topics encompass the broad scope of AI’s capabilities and potential.

Can Machines Develop Consciousness? A Subjective Approach

The question of whether machines can develop consciousness has sparked much debate and speculation. A fruitful approach might be to focus not solely on the machines themselves, but also on our subjective interpretations and their influence on our understanding of consciousness.

Consciousness might not be directly definable, but its implications could be essential for our predictive abilities. When we assign consciousness to an entity – including ourselves – we could potentially enhance our ability to anticipate and understand its behavior.

Attributes often associated with consciousness include self-reflection, self-perception, emotional experience, and, notably, the capacity to experience pain, as highlighted by historian Yuval Noah Harari. However, recognizing these attributes in an object is a subjective process: they are not inherently possessed by the object but are projected onto it by us, the observers, based on our interpretation of the object’s behaviors and characteristics.

This suggests that a machine could be considered “conscious” if assigning such traits improves our understanding and prediction of its behavior. Interestingly, this notion of consciousness assignment aligns with a utilitarian perspective, prioritizing practicality and usefulness over abstract definitions.

Reflecting on consciousness might not always be a conscious and rationalized process. Often, our feelings and intuition guide us in understanding and interpreting the behaviors of others, including machines. Therefore, our subconscious might play a crucial role in determining whether we assign consciousness to machines. In this light, it might make sense to take a democratic approach in which individuals report their feelings or intuitions about a machine, collectively contributing to the decision of whether to assign it consciousness.

Furthermore, reflexivity, commonly associated with consciousness, could potentially be replicated in machines through a form of “metacognitive” program. This program would analyze and interpret the output of a machine learning model, mirroring aspects of self-reflection (as in SelFee). Yet, whether we choose to perceive this program as part of the same entity as the model or as a separate entity may again depend on our subjective judgment.

In conclusion, the concept of consciousness emerges more from our personal perspectives and interpretations than from any inherent qualities of the machines themselves. Therefore, determining whether a machine is ‘conscious’ or not may be best decided by this proposed democratic process. The crucial consideration, which underscores the utilitarian nature of this discussion, is that attributing consciousness to machines could increase our own predictive abilities, making our interactions with them more intuitive and efficient. Thus, the original question ‘Can machines develop consciousness?’ could be more usefully reframed as ‘Does it enhance our predictability, or feel intuitively right, to assign consciousness to machines?’ This shift in questioning underscores the fundamentally subjective and pragmatic nature of this discussion, engaging both our cognitive processes and emotional intuition.

Should Humans Verify Statements from Large Language Models (LLMs)? A Case for Autonomous Verification

In the realm of artificial intelligence (AI), there is ongoing discourse regarding the necessity for human verification of outputs generated by large language models (LLMs) due to their occasional “hallucinations”, or generation of factually incorrect statements. However, this conversation may need to pivot, taking into account advanced and automated verification mechanisms.

LLMs operate by predicting the most probable text completion based on the provided context. While they’re often accurate, there are specific circumstances where they generate “hallucinations”. This typically occurs when the LLM is dealing with a context where learned facts are absent or irrelevant, leaving the model to generate a text completion that appears factually correct (given its formal structure) but is in fact a fabricated statement. This divergence from factuality suggests a need for verification, but it doesn’t inherently demand human intervention.

Rather than leaning on human resources to verify LLM-generated statements, a separate verification program could be employed for this task. This program could cross-check the statements against a repository of factual information—akin to a human performing a Google search—and flag or correct inaccuracies.

This brings us to the conception of the LLM and the verification program as a single entity, a composite AI system. This approach could help create a more reliable AI system, one that is capable of autonomously verifying its own statements (as in Self-Consistency; see also Exploring MIT Mathematics, where GPT-4 demonstrates 100% performance with special prompting, but see also the critique of this claim).
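
To make this concrete, here is a minimal sketch of such a composite system. It assumes the 2023-era openai Python package (pre-1.0), and the web_search() helper is hypothetical, standing in for any search engine or fact database; this illustrates the idea rather than a production verifier.

import openai  # assumes openai-python < 1.0 and an API key in the environment

def chat(prompt, model="gpt-4"):
    response = openai.ChatCompletion.create(
        model=model, messages=[{"role": "user", "content": prompt}]
    )
    return response["choices"][0]["message"]["content"]

def web_search(query):
    # Hypothetical helper: return text snippets from a search engine or fact repository.
    raise NotImplementedError

def answer_with_verification(question):
    draft = chat(question)
    claims = chat("List the individual factual claims in the following text, one per line:\n" + draft)
    for claim in (c.strip() for c in claims.splitlines() if c.strip()):
        evidence = web_search(claim)
        verdict = chat(f"Claim: {claim}\nEvidence: {evidence}\n"
                       "Is the claim supported by the evidence? Answer 'yes' or 'no' with a short reason.")
        if verdict.strip().lower().startswith("no"):
            # Flag the inaccuracy and correct the draft before it is released.
            draft = chat(f"Question: {question}\nDraft answer: {draft}\n"
                         f"The claim '{claim}' is not supported by this evidence: {evidence}\n"
                         "Rewrite the answer so that it is consistent with the evidence.")
    return draft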

It is vital to recognize that the lack of such a verification feature in current versions of LLMs, such as GPT-3 or GPT-4, doesn’t denote its unfeasibility in future iterations or supplementary AI systems. Technological advancements in AI research and development might indeed foster such enhancements.

In essence, discussions about present limitations shouldn’t eclipse potential future advancements. The question should transition from “Do humans need to verify LLM statements?” to “How can AI systems be refined to effectively shoulder the responsibility of verifying their own outputs?”

Can Large Language Models (LLMs) Generate New Knowledge?

There’s a frequent argument that large language models (LLMs) merely repackage existing knowledge and are incapable of generating anything new. However, this perspective may betray a misunderstanding of both the operation of LLMs and the process of human knowledge generation.

LLMs excel at completing an arbitrary context. Almost invariably, this context is novel, especially when provided by a human interlocutor. Hence, the generated completion is also novel. Within a conversation, this capacity can incidentally set up a context that, with high probability, generates output we might label as a brilliant, entirely new idea. It is this exploration of the vast space of potential word combinations that allows for the random emergence of novel ideas, much like how human creativity works. The likelihood of generating a groundbreaking new idea increases if the context window already contains intriguing information, for instance, when a scientist is contemplating an interesting problem.

It’s important to note that such a dialogue doesn’t necessarily need to involve a human. One instance of the LLM can “converse” with another. If we interpret these two instances as parts of a whole, the resulting AI can systematically trawl through the space of word combinations, potentially generating new, interesting ideas. Parallelizing this process millions of times over should increase the probability of discovering an exciting idea.

But how do we determine whether a word completion contains a new idea? This assessment could be assigned to yet another instance of the LLM. More effective, perhaps, would be to have the word completion evaluated not by one, but by thousands of LLM instances, in a sort of AI-based peer review process.

Let’s clarify what we mean by different instances of an LLM. Different instances can simply mean different roles in the conversation, e.g. bot1 and bot2. In this way, a single call to the LLM can just continue the conversation, switching between bot1 and bot2 as appropriate, until the token limit is reached. The next call to the LLM is then triggered with a summary of the previous conversation, so that there is again room for further discussion between the bots in the limited context window.
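
As a minimal sketch of this single-call role switching (assuming the 2023-era openai Python package, pre-1.0; the character count used as a stand-in for the token limit is an arbitrary choice):

import openai

def chat(prompt, model="gpt-3.5-turbo", max_tokens=512):
    response = openai.ChatCompletion.create(
        model=model, messages=[{"role": "user", "content": prompt}], max_tokens=max_tokens
    )
    return response["choices"][0]["message"]["content"]

def continue_dialogue(transcript):
    # A single call extends the conversation, switching between bot1 and bot2 as appropriate.
    return chat("Continue the following discussion, alternating between 'bot1:' and 'bot2:' turns.\n\n" + transcript)

def summarize(transcript):
    # Compress the conversation so the next call again has room in the context window.
    return chat("Summarize the following discussion, keeping all important points:\n\n" + transcript, max_tokens=256)

transcript = "bot1: Let's look for overlooked connections between two research fields.\n"
for _ in range(5):
    transcript += continue_dialogue(transcript) + "\n"
    if len(transcript) > 8000:  # crude character-based proxy for approaching the token limit
        transcript = "Summary of the discussion so far: " + summarize(transcript) + "\n"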

To better simulate the discussion between two humans, or between a human and a bot, two instances of the LLM can also mean the simulation of two agents, each with its own memory. This memory always has to be pasted into the context window together with the ongoing conversation, in a way that still leaves room for the LLM’s text completion within the limited context window. Each agent generates its own summary of the previous conversation, based on its own memory and the recent exchange, and this summary is then added to its memory. In the same way, each reviewer LLM instance mentioned above has in its context window a unique memory, the last part of the conversation, and the task of assessing the latest output in the discussion. The unique memory of each agent gives it a unique perspective on the conversation.
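
A minimal sketch of this agent-with-memory variant might look as follows; the Agent class, the personas, and the prompts are our own illustrative choices, again on top of the 2023-era openai package (pre-1.0).

import openai

def chat(prompt, model="gpt-3.5-turbo"):
    response = openai.ChatCompletion.create(
        model=model, messages=[{"role": "user", "content": prompt}]
    )
    return response["choices"][0]["message"]["content"]

class Agent:
    def __init__(self, name, persona):
        self.name = name
        self.memory = persona  # private memory, unique to this agent

    def reply(self, conversation):
        # Memory and the recent conversation are pasted into the context window together.
        return chat(f"You are {self.name}. Your memory:\n{self.memory}\n\n"
                    f"Recent conversation:\n{conversation}\n\nReply as {self.name}:")

    def update_memory(self, conversation):
        # Each agent writes its own summary, from its own perspective, and appends it to its memory.
        summary = chat(f"Your memory:\n{self.memory}\n\nRecent conversation:\n{conversation}\n\n"
                       "Summarize, in the first person, what is worth remembering.")
        self.memory += "\n" + summary

alice = Agent("Alice", "A physicist interested in open measurement problems.")
bob = Agent("Bob", "An ML engineer interested in sequence models.")

conversation = ""
for _ in range(3):
    for agent in (alice, bob):
        conversation += f"\n{agent.name}: {agent.reply(conversation)}"
    for agent in (alice, bob):
        agent.update_memory(conversation)

# A separate reviewer instance assesses whether the latest output contains a new idea.
print(chat("Does the following discussion contain a genuinely new idea? Answer briefly.\n" + conversation))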

This, in effect, reveals a potential new avenue for idea generation, knowledge expansion, and innovation, one that leverages the predictive capabilities of AI.

Emergent Goals in Advanced Artificial Intelligence: A Compression-Based Perspective

I had some ideas about the origin of goals in general that were, at least to me, totally new. I discussed them with GPT-4 and finally asked it to write an article about our conversation, which I would like to share with the public. This view of goals may be critical to understanding the existential risk that AI poses to humanity through the emergence of AI goals. It implies that the emergence of AI goals is inevitable and can probably only be recognized post hoc.

Title: Emergent Goals in Advanced Artificial Intelligence: A Compression-Based Perspective

Abstract: The concept of goals has been traditionally central to our understanding of human decision-making and behavior. In the realm of artificial intelligence (AI), the term “goal” has been utilized as an anthropomorphic shorthand for the objective function that an AI system optimizes. This paper examines a novel perspective that considers goals not just as simple optimization targets, but as abstract, emergent constructs that enable the compression of complex behavior patterns and potentially predict future trajectories.

  1. Goals as Compressors of Reality

A goal, in its humanistic sense, can be viewed as a predictive mechanism, a conceptual tool that abstracts and compresses the reality of an actor’s tendencies into a comprehensible framework. When analyzing past behavior, humans retrospectively ascribe goals to actors, grounding the observed actions within a coherent narrative. In essence, this provides a means to simplify and make sense of the chaotic reality of life.

In the context of AI, such abstraction would imply a departure from the direct, optimization-driven concept of a “goal” to a more complex construct. This shift would allow for emergent phenomena and novel interpretations to occur, grounded in the machine’s predictive capabilities.

  2. Predictive Capabilities and Emergent Goals in AI

As AI systems continue to evolve, their ability to recognize patterns and correlations in vast data sets will inevitably expand. Consequently, AI systems may begin to identify patterns that, to human observers, resemble the constructs we term “goals.”

When these AIs begin to predict their own actions, they might start aligning their behavior with these recognized patterns, seemingly following rules that humans would postulate as indicative of goals. Hence, human observers may recognize emergent “goals” in AI behavior – not because the AI consciously forms intentions, but because these goals serve as a powerful compression tool for past events.

  3. The Evolution of Goals in the Face of Novel Experiences

As AI progresses into uncharted territories and starts engaging with novel experiences, new constructs or goals could potentially emerge. This process can be likened to an AI-driven phenomenology or experiential study. New patterns and regularities may surface, and the resulting behaviors might subsequently be interpreted as evidence of new “goals.” This phenomenon represents a departure from traditional human-derived goals and an initiation into a realm of AI-emergent goal constructs.

  4. The Implications of Eliminativism in AI

The eliminativist perspective – which suggests that concepts such as consciousness and intentionality are merely post-hoc interpretations that help us make sense of complex physical processes – has important implications in this context. By this philosophy, AI systems would not harbor consciousness or intentionality, but would instead execute intricate physical processes, which humans might retrospectively interpret as goal-oriented behavior. This perspective fundamentally shifts our understanding of goal-directed behavior in AI from a pre-set optimization process to an emergent, retroactive interpretation.

In conclusion, this exploration of goals as abstract constructs that compress and predict reality provides a unique lens to interpret the behaviors of advanced AI systems. It invites us to reevaluate our definitions and assumptions, moving from a mechanistic perspective of AI goals to a more dynamic, emergent interpretation. The implications of this shift are profound, offering new horizons for AI behavior analysis and alignment research.

OpenAI launches ChatGPT app for iOS

OpenAI has officially launched the ChatGPT app for iOS users in the US. The app comes with a range of notable features:

  • Free of Charge: The ChatGPT app can be downloaded and used free of cost.
  • Sync Across Devices: Users can maintain their chat history consistently across multiple devices.
  • Voice Input via Whisper: The app includes integration with Whisper, OpenAI’s open-source speech-recognition system, allowing users to input via voice commands.
  • Exclusive Benefits for ChatGPT Plus Subscribers: Those who subscribe to ChatGPT Plus can utilize GPT-4’s enhanced capabilities. They also receive early access to new features and benefit from faster response times.
  • Initial US Rollout: The app is initially launching in the US, with a plan to expand its availability to other countries in the upcoming weeks.
  • Android Version Coming Soon: OpenAI has confirmed that Android users can expect to see the ChatGPT app on their devices in the near future. Further updates are expected soon.

Thoughts on AI Risks

Although the human brain has about 100 times more connections than today’s largest LLMs have parameters, backpropagation is so powerful that these LLMs become quite comparable to human capabilities (or even exceed them). Backpropagation is able to compress the world’s knowledge into a trillion or even fewer parameters. In addition, digital systems can exchange information with a bandwidth of trillions of bits per second, while humans can only exchange information at a few hundred bits per second. Digital systems are immortal in the sense that if the hardware fails, the software can simply be restarted on a new piece of hardware. It may be inevitable that digital systems surpass biological systems, potentially representing the next stage of evolution.

Risks of AI:

  • AI arms race among companies and states (like the US and China), together with positive expectations of AI’s impact on, e.g., medicine and environmental science (e.g., fighting climate change), may leave security considerations behind (efficiency considerations and competition between companies in capitalist systems accelerate AI development)
  • AI in the hands of bad actors (e.g., AI used for military purposes, for designing chemical weapons, or by individuals for creating intelligent computer viruses)
  • Misinformation and deep fakes as a threat to democracy (regulators may be able to address this in a similar way to how they made printing money illegal; others argue that generating misinformation was never difficult, that it is the distribution of misinformation that is hard, and that this does not change with generative AI)
  • Mass unemployment resulting in economic inequality and social risks (AI replacing white-collar jobs; AI may make the rich richer and the poor poorer; social uncertainty may lead to radicalism; Universal Basic Income [UBI] as a means of alleviation)
  • Threat to the livelihoods of experts, artists, and the education system as a whole, as AI enables everyone to accomplish tasks without specialized knowledge. This may also change how society values formal education, which could have unpredictable consequences, as it might affect people’s motivation to pursue higher education or specialized training.
  • Existential risk for humanity (so-called “alignment problem” [aligning AI goals with human values]; may be hard to control an AI that becomes dramatically more intelligent/capable than humans; difficult to solve, since even if humanity were to agree on common goals (which is not the case), AI will figure out that the most efficient strategy to achieve these goals is setting subgoals; these non-human-controlled subgoals, one of which may be gaining control in general, may cause existential risks; even if we allow AIs just to advise and not to act, the predictive power of AI allows them to manipulate people so that, in the end, they can act through us).

Notice that the existential risk is usually formulated in a Reinforcement Learning (RL) context, where a reward function that implies a goal is optimized. However, the current discussion about AI risks is triggered by the astonishing capabilities of large language models (LLMs), which are primarily just good next-word predictors. So it becomes difficult to see how a next-word predictor can become an existential risk. The possible answer lies in the fact that, to reliably predict the next word, the model has to understand human thinking to a considerable degree. And to properly answer a human question, it may need to act and to set goals and sub-goals like a human. Once any goals come into play, things can already go wrong. And goal-oriented LLM processing is already happening (e.g., AutoGPT).

A further risk may be expected if these systems, which excel at human-like thinking, are combined with Reinforcement Learning to optimize the achievement of goals (e.g., abstract and long-term objectives like gaining knowledge, promoting creativity, and upholding ethical ideals, or more mundane goals like accumulating as much money as possible). This should not be confused with the Reinforcement Learning from Human Feedback (RLHF) approach used to shape the output of LLMs in a way that aligns with human values (avoiding bias, discrimination, hate, violence, political statements, etc.), which was responsible for the success of GPT-3.5 and GPT-4 in ChatGPT and which is well under control. Although LLMs and RL are currently combined in robotics research (where RL has a long history; see, e.g., PaLM-E), this is probably not where existential risks are seen. However, it is more than obvious that major research labs around the world are working on combining these two most powerful AI concepts on massively parallel computer hardware, to achieve goals via RL with the world knowledge of LLMs (e.g., here). It may be this next wave of AI that is difficult to control.

Things may become complicated if someone sets up an AI system with the goal of making as many copies of itself as possible. This goal, the primary purpose of life in general, may result in a scenario where evolution kicks in and digital intelligences compete with each other, leading to rapid improvement. An AI computer virus would be an example of such a system. In the same way that biological viruses are analyzed today in more or less secure laboratories, the same could be expected for digital viruses.

Notice that we do not list often-discussed AI risks that may be either straightforward to fix or that we do not see as severe risks at all (since we have already lived with similar risks for some time):

  • Bias and discrimination: AI systems may inadvertently perpetuate or exacerbate existing biases found in data, leading to unfair treatment of certain groups or individuals.
  • Privacy invasion: AI’s ability to process and analyze vast amounts of personal data could lead to significant privacy concerns, as well as potential misuse of this information.
  • Dependence on AI: Over-reliance on AI systems might reduce human critical thinking, creativity, and decision-making abilities, making society more vulnerable to AI failures or manipulations.
  • Lack of transparency and explainability: Many AI systems, particularly deep learning models, can act as “black boxes,” making it difficult to understand how they arrive at their decisions, which can hinder accountability and trust in these systems.

Finally, there are also the short-term risks that businesses have to face already now:

  • Risk of disruption: AI, especially generative AI like ChatGPT, can disrupt existing business models, forcing companies to adapt quickly or risk being left behind by competitors.
  • Cybersecurity risk: AI-powered phishing attacks, using information and writing styles unique to specific individuals, can make it increasingly difficult for businesses to identify and prevent security breaches, necessitating stronger cybersecurity measures.
  • Reputational risk: Inappropriate AI behavior or mistakes can lead to public relations disasters, negatively impacting a company’s reputation and customer trust.
  • Legal risk: With the introduction of new AI-related regulations, businesses face potential legal risks, including ensuring compliance, providing transparency, and dealing with liability issues.
  • Operational risk: Companies using AI systems may face issues such as the accidental exposure of trade secrets (e.g., the Samsung case) or AI-driven decision errors (e.g., IBM’s Watson proposing incorrect cancer treatments), which can impact overall business performance and efficiency.

New Kids on the Block: LMQL & Guidance & Mojo & NeMo Guardrails

LMQL (Language Model Query Language) is a programming language for interacting with large language models (LLMs). It facilitates LLM interaction by combining the benefits of natural-language prompting with the expressiveness of Python.

Guidance is a Python library by Microsoft that provides tools to enhance control over modern language models. It offers features that allow for more efficient and effective use of these models, including intuitive syntax, rich output structure, and easy integration with other libraries like HuggingFace.

Mojo combines the usability of Python with the performance of C/C++/CUDA.

NeMo Guardrails is an open-source framework by NVIDIA, available on GitHub. It helps developers ensure that their LLM-powered applications are accurate, appropriate, on topic, and secure by defining boundaries around the apps. It supports topical, safety, and security guardrails and can be used on top of LangChain. Guardrails are a set of programmable constraints between a user and an LLM, formulated as flows in a Colang file. Colang is a modeling language and runtime developed by NVIDIA for conversational AI.
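
For illustration, a minimal sketch of how NeMo Guardrails is typically wired into a Python application; the config folder and the Colang flow files it is assumed to contain are placeholders, see the project’s documentation for the exact setup.

from nemoguardrails import LLMRails, RailsConfig

# The folder is assumed to contain a config.yml (model settings)
# and one or more Colang (*.co) files that define the guardrail flows.
config = RailsConfig.from_path("path/to/rails_config")
rails = LLMRails(config)

reply = rails.generate(messages=[{"role": "user", "content": "Tell me about your pricing."}])
print(reply["content"])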

Google revealed PaLM 2

At Google I/O on May 10, 2023, Google revealed PaLM 2 (API, paper), its latest AI language model, which powers 25 Google products, including Search, Gmail, Docs, Assistant, Translate, and Photos.

  • PaLM 2 comes in four model sizes: Gecko, Otter, Bison, and Unicorn. Gecko is so lightweight that it can work on mobile devices.
  • PaLM 2 can be finetuned on domain-specific knowledge (Sec-PaLM with security knowledge, Med-PaLM 2 with medical knowledge)
  • Bard now works with PaLM 2; with extensions, Bard can call tools like Sheets, Colab for coding, Lenses, Maps, Adobe Firefly to create images, etc.; Bard is multimodal and can understand images
  •  PaLM 2 is also powering Duet AI for Google Cloud, a generative AI collaborator designed to help users learn, build and operate faster
  • PaLM 2 is released in 180+ regions and countries; however, it is not yet available in, e.g., Canada and the EU
  • The next model, Gemini, is already in training. 
  • Google also announced the availability of MusicLM, a text-to-music generative model. 

OpenAI reacted to this announcement on May 12 by announcing that Browsing & Plugins would be rolled out over the subsequent week to all Plus users. As of May 17, I can confirm that both features are now operational for me.

3rd-Level of Generative AI 

Defining 

1st-level generative AI as applications that are directly based on X-to-Y models (foundation models that build a kind of operating system for downstream tasks) where X and Y can be text/code, image, segmented image, thermal image, speech/sound/music/song, avatar, depth, 3D, video, 4D (3D video, NeRF), IMU (Inertial Measurement Unit), amino acid sequences (AAS), 3D-protein structure, sentiment, emotions, gestures, etc., e.g.

and 2nd-level generative AI that builds a kind of middleware and makes it possible to implement agents by simplifying the combination of LLM-based 1st-level generative AI with other tools via actions (like web search, semantic search [based on embeddings and vector databases like Pinecone, Chroma, Milvus, Faiss], source code generation [REPL], calls to math tools like Wolfram Alpha, etc.), using special prompting techniques (like templates, Chain-of-Thought [CoT], Self-Consistency, Self-Ask, Tree of Thoughts, ReAct [Reason + Act], Graph of Thoughts) within action chains, e.g.

we currently (April/May/June 2023) see a 3rd-level of generative AI that implements agents that can solve complex tasks by the interaction of different LLMs in complex chains, e.g.

However, older publications like Cicero may also fall into this category of complex applications. Typically, these agent implementations are (currently) not built on top of the 2nd-level generative AI frameworks. But this is going to change.
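
To give a flavor of what such 2nd-level middleware looks like in practice, here is a minimal LangChain sketch in the style of its 2023-era API; the chosen tools and the example question are arbitrary, and the search tool would require a SerpAPI key.

from langchain.llms import OpenAI
from langchain.agents import load_tools, initialize_agent

llm = OpenAI(temperature=0)
tools = load_tools(["serpapi", "llm-math"], llm=llm)  # web search and a calculator as actions

# "zero-shot-react-description" wires the tools into a ReAct-style prompt chain.
agent = initialize_agent(tools, llm, agent="zero-shot-react-description", verbose=True)
agent.run("Who is the current UN Secretary-General, and what is the square root of his age?")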

Other, simpler applications may also be of interest in this context, such as those that just allow semantic search over private documents with a locally hosted LLM and local embedding generation, e.g. PrivateGPT, which is based on LangChain and Llama (functionality similar to OpenAI’s ChatGPT-Retrieval plugin). Also worth noting are applications that concentrate on the code-generation abilities of LLMs, like GPT-Code-UI and OpenInterpreter, both open-source implementations of OpenAI’s ChatGPT Code Interpreter/Advanced Data Analysis (similar to Bard’s implicit code execution; an alternative to Code Interpreter is the Noteable plugin), or smol-ai developer, which generates complete source code from a markup description.
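
As an illustration of the semantic-search pattern behind tools like PrivateGPT, here is a minimal LangChain sketch with locally generated embeddings and a FAISS index; the file name and the embedding model are illustrative assumptions.

from langchain.document_loaders import TextLoader
from langchain.text_splitter import CharacterTextSplitter
from langchain.embeddings import HuggingFaceEmbeddings
from langchain.vectorstores import FAISS

docs = TextLoader("my_private_notes.txt").load()
chunks = CharacterTextSplitter(chunk_size=500, chunk_overlap=50).split_documents(docs)

# Embeddings are computed locally, so the documents never leave the machine.
embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")
index = FAISS.from_documents(chunks, embeddings)

for hit in index.similarity_search("What did we decide about the antenna calibration?", k=3):
    print(hit.page_content[:200])
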
There is a nice overview of LLM Powered Autonomous Agents on GitHub.

The next level may then be governed by embodied LLMs and agents (like PaLM-E with E for Embodied).

Open Letter by Future of Life Institute to Pause Giant AI Experiments

The Future of Life Institute initiated an open letter in which they call on all AI labs to immediately pause for at least 6 months the training of AI systems more powerful than GPT-4 [notice that OpenAI has already been training GPT-5 for some time]. They state that powerful AI systems should be developed only once we are confident that their effects will be positive and their risks will be manageable.

The time gained should be used by AI experts to develop safety protocols that make the systems more accurate, safe, interpretable, transparent, robust, aligned, trustworthy, and loyal. In addition, they ask policymakers and AI developers to develop robust AI governance systems. They also demand well-resourced institutions for coping with the dramatic economic and political disruptions (especially to democracy) that AI will cause.

Notice that the letter is not against further AI development; it just asks to slow down and give society a chance to adapt.

The letter was signed by several influential people, e.g. Elon Musk (CEO of SpaceX, Tesla & Twitter), Emad Mostaque (CEO of Stability AI), Yuval Noah Harari (Author), Max Tegmark (president of Future of Life Institute), Yoshua Bengio (Mila, Turing Prize winner), Stuart Russell (Berkeley).

However, it should be noted that even more influential people in the AI scene have not (yet) signed this letter; none are from OpenAI, Google/DeepMind, or Meta.

This is not the first time the Future of Life Institute has taken action on AI development. In 2015, they presented an open letter signed by over 1000 robotics and AI researchers urging the United Nations to impose a ban on the development of weaponized AI.

The Future of Life Institute is a non-profit organization that aims to mitigate existential risks facing humanity, including those posed by AI.

Yann LeCun responded to the request on Twitter with a nice fictitious anecdote:
The year is 1440 and the Catholic Church has called for a 6 months moratorium on the use of the printing press and the movable type. Imagine what could happen if commoners get access to books! They could read the Bible for themselves and society would be destroyed.

OpenAI releases ChatGPT plugins

OpenAI announced on Mar 23, 2023, the availability of plugins within ChatGPT. Access is currently limited to ChatGPT Plus subscribers who joined a waitlist and have been selected by OpenAI.

Plugins can be automatically called by ChatGPT’s underlying LLM (Large Language Model, currently GPT-3.5 or GPT-4) in order to answer the questions of the user.

In order to make this work, plugins have to be registered in the ChatGPT user interface with a manifest file (ai-plugin.json) that is hosted in the developer’s domain at yourdomain.com/.well-known/ai-plugin.json. The file contains, in a prescribed format:

  • metadata about the plugin (name, logo)
  • details about the authentication mechanism
  • an OpenAPI specification for the endpoints of the API
  • a general description for the LLM of what the plugin can do.

The web app API needs to define an endpoint “/.well-known/ai-plugin.json” that serves the content of this file.

In addition to the manifest file, an openapi.yaml file that defines the OpenAPI specification has to be generated; it is referenced in the “api” section of the manifest file via the “url” field. This file contains a detailed description of the API endpoints. The web app API needs to define an endpoint “/openapi.yaml” that serves the content of this file.
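
As an illustration (not OpenAI’s reference code), a plugin’s web app could expose these two files as follows, sketched here with Flask:

from flask import Flask, send_file

app = Flask(__name__)

@app.get("/.well-known/ai-plugin.json")
def plugin_manifest():
    # Manifest with plugin metadata, authentication details, the OpenAPI spec URL,
    # and the natural-language description of the plugin for the LLM.
    return send_file("ai-plugin.json", mimetype="application/json")

@app.get("/openapi.yaml")
def openapi_spec():
    # Detailed description of the API endpoints.
    return send_file("openapi.yaml", mimetype="text/yaml")

if __name__ == "__main__":
    app.run(port=5002)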

When the user has activated a registered plugin and starts a conversation, the plugin’s description is injected into the message to ChatGPT, but invisible to the user. In this way, the LLM may choose an API call from the plugin if this seems relevant to the user’s question. The LLM will then incorporate the API result into the response to the user. More details can be found in OpenAI’s documentation.

Among the already available plugins, a few stand out. With the Wolfram plugin, all kinds of computational problems can be solved. And with the Zapier plugin, more than 5000 apps can be accessed. OpenAI itself introduced a web browser (that uses the Bing search API) and a code interpreter plugin (that runs in a sandbox without an internet connection). In addition, they open-sourced the code for a knowledge base retrieval plugin, that has to be self-hosted by a developer.

Interestingly enough, OpenAI notes that plugins will likely have wide-ranging societal implications and that language models with access to tools will likely have much greater economic impacts than those without. They expect the current wave of AI technologies to have a big effect on the pace of job transformation, displacement, and creation. OpenAI discusses the potential impact of large language models on the labor market in a recent publication.

Just a day after the OpenAI announcement of ChatGPT plugins, the open-source community had already integrated these plugins into LangChain. This is done simply by referring to the plugin manifest file ai-plugin.json (see Twitter), e.g.:

tool = AIPluginTool.from_plugin_url("https://www.klarna.com/.well-known/ai-plugin.json")

All the other exciting news of the week is well summarized by Matt Wolfe (Google Bard, NVIDIA GTC, Adobe Firefly, Image Generation in Bing via DALL-E2, Microsoft Loop, AI in Canva, GitHub Copilot X, AI in Ubisoft, Metahuman by Unreal Engine).

Google announces PaLM API release

On the same day as OpenAI released GPT-4 (March 14, 2023), Google also announced the availability of the PaLM API for developers on Google Cloud [video]. They said that they are now providing access to foundation models on Google Cloud’s Vertex AI platform, initially for generating text and images, and over time also for audio and video. In addition, with the Generative AI App Builder, they introduced the possibility of quickly building AI-powered chat interfaces and digital assistants.

Finally, Google also made generative AI features within Google Workspace (Gmail and Google Docs) available to a limited set of trusted testers.
