[{"content":"Perplexity has unveiled Personal Computer, a system that combines local files with the Perplexity Computer and runs 24/7 on a Mac Mini. It is designed to understand your goals, use different tools, and continue working even when you step away. A waitlist is now open for anyone interested.\nHow does Personal Computer work?The system runs on a Mac Mini and is available around the clock. The Mac Mini is a compact computer from Apple known for its small size – it is about 20 times smaller than a typical PC tower and weighs only 670 grams.\nThe choice of Apple’s Mac Mini is not accidental. Recently, many companies have been using this model to run local AI agents because of its relatively low price. In practice, the Mac Mini often functions as an interface and storage hub, while the AI agents themselves may operate through external language model services.\nOne of the most prominent examples is OpenClaw, which launched a local AI agent capable of connecting to other language model services that perform tasks on its behalf.\nMalcolm Owen, Product Comparison Expert at AppleInsider, noted that Perplexity’s Personal Computer appears to have been created partly in response to OpenClaw and its close ties with OpenAI and ChatGPT. He also points out that other major companies are developing local AI-powered solutions as well. For example, Claude Cowork is a desktop assistant designed to work with a user’s local files. Perplexity’s new development appears to be a significantly expanded version of this idea.\n“Personal Computer is a digital proxy for you, working constantly on your behalf and allowing you to orchestrate all of your tools, tasks, and files from any device, anywhere,” Perplexity stated.\nIn addition to efficiency, the new personal computer is designed to operate within a secure environment with clear protection mechanisms. Each session generates an audit log, a kill switch gives users full control, and sensitive actions require confirmation.\nWhat does this mean for companies?Personal Computer can be used by large organizations to support and streamline their workflows. According to the developers, Perplexity Computer saved their internal team $1.6 million and completed 3.25 years of work in just four weeks.\nComputer for Enterprise does not require additional infrastructure and connects directly to tools already used by companies. Even more importantly, the system is trainable and easy to integrate, meaning it can be customized to fit almost any company workflow. Perplexity says that through connectors, the system can work directly with Snowflake, Salesforce, HubSpot, and hundreds of other platforms.\nThis means, for example, that a financial analyst can request revenue data by vertical from Snowflake, while the sales team simultaneously receives CRM insights and competitive context. The Computer generates queries, executes them, and returns structured results, the company explains.\nUsing Slack, teams can interact with the system to write code with Codex and Claude, create dashboards, financial models, and presentations – and most importantly, launch scheduled processes asynchronously.\nThe creators also emphasize that the Computer runs on a secure platform with SOC 2 Type II compliance, SAML SSO and audit logs.\nComet Enterprise – browser automation with native AIComet Enterprise introduces an organizational browser with native AI capabilities. 
Using Slack, teams can interact with the system to write code with Codex and Claude, create dashboards, financial models, and presentations – and, most importantly, launch scheduled processes asynchronously.\nThe creators also emphasize that the Computer runs on a secure platform with SOC 2 Type II compliance, SAML SSO, and audit logs.\nComet Enterprise – browser automation with native AI\nComet Enterprise introduces an organizational browser with native AI capabilities. The browser understands the context of open tabs and automates many tasks.\nAdministrators control how the assistant operates. For example, they can allow it to answer questions without taking action, enable more proactive automation in low-risk areas, or review logs of every session.\nEmployees can install Comet on their devices through centralized deployment via MDM. Additional capabilities include applying browser policies, blocking domains and extensions, and monitoring activity through exportable telemetry.\nComet Enterprise also focuses heavily on security. The product is developed in partnership with CrowdStrike – a cybersecurity company known for its cloud platform CrowdStrike Falcon, which functions as a next-generation antivirus (NGAV) and endpoint detection and response (EDR) system. The platform uses AI to protect against malware, ransomware, and hacking attacks.\n“Through our partnership with CrowdStrike, Comet Enterprise customers gain additional browser-level protections, including visibility into installed extensions, risk scores, and protections that help prevent sensitive information from being entered into Comet,” Perplexity declared.\nWhat does the Computer mean for developers?\nPerplexity is expanding its platform with four APIs: Search, Agent, Embeddings, and Sandbox. These are the same building blocks that power Computer internally, but they are now available through the platform’s API.\nThe main functions of each API include:\nSearch – retrieving reliable information\nAgent – delegating multi-step tasks\nSandbox – a secure execution environment\nEmbeddings – powering search and ranking systems
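As a rough illustration of what calling one of these building blocks might look like, here is a minimal sketch of a Search API request over HTTPS. Perplexity has not published the request shape in this announcement, so the endpoint path, parameter names, and response fields below are assumptions:

import os

import requests

# Assumed endpoint and payload shape – illustrative only, not confirmed by Perplexity.
API_URL = "https://api.perplexity.ai/search"
headers = {"Authorization": f"Bearer {os.environ['PERPLEXITY_API_KEY']}"}
payload = {
    "query": "SOC 2 Type II requirements for AI platforms",
    "max_results": 5,  # hypothetical parameter name
}

response = requests.post(API_URL, headers=headers, json=payload, timeout=30)
response.raise_for_status()
for result in response.json().get("results", []):  # assumed response field
    print(result.get("title"), "-", result.get("url"))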
Working with premium data sources\nAnother key feature is Premium Sources, which allows professional data providers to be integrated into Perplexity. The company notes that many of these sources are normally paid services, but Perplexity makes them accessible directly through the platform. Currently, the Computer can access Statista, CB Insights, and PitchBook.\nImproved financial capabilities\nThe company also introduced an updated Perplexity Finance, providing direct access to more than 40 financial tools. These include SEC filings, FactSet, S&P Global, Coinbase, LSEG, Quartr, and others. None of them require additional configuration, licenses, or API keys, making them easier to use.\nMoreover, the Computer can now generate Excel models, build full financial applications, and create interactive dashboards. These tools are available both for personal use and for broader institutional analysis.\nThe system can also act as a full financial assistant by pulling data from prediction markets such as Polymarket, and by connecting brokerage accounts through Plaid to analyze portfolios and risk exposure.\nBackground: from Perplexity Computer to Personal Computer\nAt the end of February, the company announced the release of Perplexity Computer, whose core idea was to unify multiple AI models into a single system.\nThe creators explained that in AI this concept is known as “orchestration” – a management layer coordinating multiple models. They argued that collaboration between several models can produce results far beyond the capabilities of any single model.\nFor this purpose, they created a “computer” – a multimodel system that combines different AI systems for efficient, autonomous, and personalized work.\nPerplexity wrote:\n“The next phase of AI must be massively multimodel. That’s Perplexity Computer. Nineteen models are available in the backend, and the outputs have impressed me more than anything in AI for a while.”\nPerplexity described its first experimental system as ASI – Artificial Super Intelligence. The idea was to treat it as a digital worker.\nIn practice, the workflow looked like this: a user sends a task through Slack, the system delegates subtasks, creates files to support its own work, searches for relevant information, and periodically checks progress. The process runs without human involvement, and by the next day users can receive the results of what would normally be a week of work.\nAt its core, the concept suggests that the computer of the future is not simply a device with a graphical interface (GUI), but a system capable of interacting with the web, files, and software tools like a personal digital worker through model orchestration.\nThe newly announced system is essentially a modified and expanded version of the original Perplexity Computer – now personalized for each user and connected to local applications as well as secure Perplexity servers.\nPerplexity CEO Aravind Srinivas, commenting on the launch on X, noted:\n“It’s personal and more powerful than any AI system ever launched.”\n","permalink":"https://coursiv.io/news/perplexity-personal-computer-ai-agents/","summary":"Perplexity launches Personal Computer, an always-on AI-agent system running on Mac Mini with enterprise automation, security controls, and API integrations.","title":"Perplexity Introduces a Personal Computer Built Around AI Agents"},{"content":"According to reports from The Information, OpenAI has begun developing its own code-hosting platform. The company wants the new platform to be closely integrated with AI tools, allowing the service to automatically suggest improvements, resolve issues, and generate code. One of the main reasons behind the move is the growing number of service outages during which GitHub, owned by Microsoft, became unavailable.\nThe project is still in its earliest stages and may take several months to develop, according to The Information. The company is also reportedly considering making the code repository available for purchase to its existing customer base. The move may signal a growing strategic distance between OpenAI and Microsoft, one of its most important partners and the owner of GitHub.\nHow Codex fits into OpenAI’s platform strategy\nThe initiative to build its own platform may also be closely connected to OpenAI’s existing coding system, Codex, which can translate natural language instructions into code, write features, run tests, and answer questions about a codebase. Integrating a repository with Codex’s capabilities could create a more unified development environment in which AI agents actively participate in building and maintaining applications.\nAdditionally, this move reflects a broader trend among major tech companies regarding software development. Microsoft, Meta, and Amazon have reported that AI is now generating a significant portion of their internal code, accelerating the shift toward automated programming workflows.\nWhat is GitHub and how OpenAI uses it\nGitHub is a platform where developers can store code, share it with others, and collaborate on writing it. As a cloud-based service, it enables users to track and manage changes to their code over time, while also giving other contributors access to projects so they can suggest improvements.
It is worth noting that when working collaboratively, the changes made by each contributor do not disrupt the main line of development, which makes this feature especially powerful for teams working on the same project.\nOpenAI uses GitHub as a platform for code storage, collaborative development, and publishing open-source projects. A key point is GitHub’s integration with AI tools, which resulted in the joint AI assistant GitHub Copilot, designed to help programmers write code faster and with less effort.\nThe company has also highlighted the ability to connect GitHub repositories to ChatGPT applications, as well as to the ChatGPT agent, in order to ask questions based on a user’s own code.\n“When you connect to GitHub, ChatGPT can pull live data from your repositories – code, README files, and other docs – and reason over it in real time, either with an app with sync, an app with file search, or an app with deep research. Just connect, ask a question, and ChatGPT will read, analyze, and cite the relevant snippets straight from your GitHub content.” – OpenAI\nWhat this means for the competitive landscape\nBuilding proprietary code repositories is already a common practice: companies like Google and Meta use their own internal systems to manage codebases. Yet if OpenAI implements the project successfully, it would mark a step toward a new type of developer platform, shifting the focus from traditional source code management to generative AI – and it could significantly affect competition among the major tech giants.\nAccording to Reuters, OpenAI’s latest funding round valued the company at $840 billion, as recent investments from Big Tech and Masayoshi Son’s SoftBank totaled $110 billion, providing the resources for OpenAI to pursue the development of its own code repository.\nFor now, however, GitHub remains the most popular cloud-based platform for storing and collaborating on code, with over 180 million users.\n","permalink":"https://coursiv.io/news/openai-code-hosting-platform-github-alternative/","summary":"OpenAI has begun developing its own code-hosting platform with deep AI integration, signaling a strategic shift away from Microsoft’s GitHub and closer ties to its Codex coding system.","title":"OpenAI Is Developing Its Own Code-Hosting Platform as a Potential GitHub Alternative"},{"content":"According to an internal memo obtained by The Wall Street Journal, Meta is going a step further by building a new engineering organization focused on applied artificial intelligence – an initiative designed to accelerate the development of what many in the industry increasingly refer to as superintelligence.\nSuperintelligence is a hypothetical form of artificial intelligence that would surpass human capabilities across all cognitive tasks.\nFrom restructuring to a new engineering organization\nLast summer, Meta restructured its AI operations, leading to the launch of Meta Superintelligence Labs, headed by former Scale AI CEO Alexandr Wang. The creation of the new engineering team is a direct consequence of this restructuring.\nThe new organization will be led by Maher Saba, a vice president at Reality Labs, the division responsible for Meta’s metaverse-related products, and will focus on building developer tools, data pipelines, and evaluation systems. These components are intended to improve AI models through feedback loops and real-world data.
Business Insider also reports that the new organization will work closely with Meta Superintelligence Labs, which focuses on developing Meta’s most advanced AI models.\nTeam structure and operational model\nAccording to The Wall Street Journal, Maher Saba stated that the new organization will operate with two teams. One team will focus on building interfaces and development tools, while the other will handle operational tasks – conducting evaluations, generating datasets, and delivering them to the development teams. The key advantage of this structure is the ability to produce higher-quality training data and more efficient evaluation processes.\n“Building great models isn’t just about researchers and compute; it requires real-world data, feedback and evals. This creates the flywheel that turns a strong model into a leading one. Lately, we’ve seen some excellent gains from reinforcement learning and post-training and we believe we have a real opportunity to move faster and pull ahead if we double down on these efforts.” – Maher Saba\nFlat management as a competitive advantage\nNotably, the engineering organization will operate with an unusually flat management structure: the ratio of managers to engineers could reach as high as one manager for every 50 engineers. According to reports, this approach is designed to empower individual engineers and enable large teams to operate with minimal management layers, aligning with Mark Zuckerberg’s philosophy. He has repeatedly stated that reducing hierarchy helps elevate the role of individual contributors, driving faster development and keeping the team adaptable.\nWhat does this tell us\nMeta currently operates through three main groups: the research laboratory led by Wang, the applied AI engineering organization headed by Saba, and the broader technology strategy under CTO Andrew Bosworth. Together, these teams drive the development of cutting-edge technologies, supporting the company’s push toward advanced AI and superintelligence.\nMore broadly, the AI race is gradually shifting from building the models themselves to developing the infrastructure around them.\nCompanies are increasingly focused on controlling data pipelines, developer tools, and evaluation systems – elements that may ultimately determine the leaders of the next phase of artificial intelligence development. Meta’s decision to create a new organization further highlights the ambitions of major tech companies and their growing focus on infrastructure as a key element of the strategy to achieve superintelligence.\n","permalink":"https://coursiv.io/news/meta-superintelligence-labs-new-engineering-team/","summary":"Meta is building a new applied AI engineering organization under Maher Saba to accelerate superintelligence development, focusing on developer tools, data pipelines, and evaluation systems.","title":"Meta Superintelligence Labs Pushes Toward Advanced AI with New Engineering Team"},{"content":"Google has announced one of its most significant updates for enterprise customers – the public preview release of Gemini Embedding 2, the first embedding model designed to be natively multimodal. Built on the Gemini architecture, the model can map text, images, video, audio, and documents into a single embedding space, enabling multimodal search and classification across different types of media.\nThe model is already available in public preview via the Gemini API and Vertex AI.\nWhat is an embedding\nAn embedding is a numerical representation of data that allows models to analyze and interpret information. Embeddings can be used not only for text, but also for images, audio, and video.\nText embeddings are numerical vectors that map words or phrases into a multidimensional space. These vectors help machine learning models recognize meaning and relationships within text.
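To see what such vectors buy you, here is a toy Python illustration with made-up 4-dimensional embeddings: cosine similarity scores how close two pieces of text sit in that space. Real models use thousands of dimensions, but the principle is the same:

import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    # Cosine of the angle between two vectors: close to 1.0 means similar meaning.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Made-up 4-dimensional embeddings, for illustration only.
dog = np.array([0.9, 0.1, 0.0, 0.3])
puppy = np.array([0.8, 0.2, 0.1, 0.4])
car = np.array([0.0, 0.9, 0.8, 0.1])

print(cosine_similarity(dog, puppy))  # high: related meanings sit close together
print(cosine_similarity(dog, car))    # low: unrelated meanings sit far apart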
As Google explains:\n“Expanding on our previous text-only foundation, Gemini Embedding 2 maps text, images, videos, audio and documents into a single, unified embedding space, and captures semantic intent across over 100 languages. This simplifies complex pipelines and enhances a wide variety of multimodal downstream tasks—from Retrieval-Augmented Generation (RAG) and semantic search to sentiment analysis and data clustering.”\nKey features of the new embedding model\nGemini Embedding 2 can generate embeddings for a wide range of data types. Here are the capabilities of the model across different media formats:\nText: supports context up to 8192 input tokens\nImages: can process up to 6 images per request, in formats such as PNG and JPEG\nVideo: supports videos up to 120 seconds in MP4 and MOV formats\nAudio: can process audio data directly, without requiring text transcription\nDocuments: can generate embeddings for PDF files up to 6 pages\nAn important clarification: these limits apply only to the amount of input data per request, not to the system’s memory or storage capacity. This means that you cannot, for example, upload hundreds of pages of a PDF file at once. Instead, the document needs to be split into segments of up to six pages, and each segment must be sent separately.\nAnother important feature is that the model’s output is cumulative. Once the fragments you send are converted into vectors, they can be stored together in a database, which later enables search across all files, as the sketch below shows. The same principle applies to video and audio formats as well.
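Here is a minimal sketch of that chunk-and-store pattern for a long PDF, splitting it into six-page segments with pypdf and embedding each segment with the google-genai SDK. The model name gemini-embedding-2 is a placeholder, and the sketch embeds each segment’s extracted text to stay runnable today; the announcement says the model accepts document pages natively, but does not show that request shape:

import numpy as np
from google import genai  # pip install google-genai
from pypdf import PdfReader  # pip install pypdf

client = genai.Client()  # reads GEMINI_API_KEY from the environment
reader = PdfReader("annual_report.pdf")  # e.g. a 60-page document
page_count = len(reader.pages)

vectors = []
for start in range(0, page_count, 6):  # six pages per request, per the stated limit
    segment = [reader.pages[i] for i in range(start, min(start + 6, page_count))]
    text = " ".join(page.extract_text() or "" for page in segment)
    result = client.models.embed_content(
        model="gemini-embedding-2",  # placeholder name, for illustration
        contents=text,
    )
    vectors.append(result.embeddings[0].values)

# Stored together, the segment vectors form one searchable index over the whole file.
index = np.array(vectors)
print(index.shape)  # (number_of_segments, embedding_dimension)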
How Gemini Embedding 2 differs from previous approaches\nPreviously, different approaches were used to connect two different types of data. CLIP, for example, relied on two separate encoders: one for text and one for images. This approach fell short because the separate encoders processed the data independently, and their outputs were aligned only afterwards using contrastive learning.\nBecause this alignment occurred only at the final stage, it missed deeper cross-modal connections. The system could understand that certain images correspond to certain pieces of text, but it was not always able to capture the more complex interactions and relationships between different types of data, some of which could have formed in the intermediate layers of the network.\nA key advantage of the new model is that it natively understands combined inputs, such as image + text within a single request. This allows it to process complex data and capture relationships between different types of media.\nMatryoshka Representation Learning\nOne of the nuances of working with embeddings is that higher-dimensional vectors can capture more detail, but they also require more memory. To address this, Gemini Embedding 2 uses a technique called Matryoshka Representation Learning (MRL).\nThe idea is that a single representation vector can be truncated to fewer dimensions while still retaining its usefulness for tasks such as search or text comparison. In other words, the technique nests information within the vector, reducing its size while preserving performance.\nMRL directs the most important information into the earliest dimensions of the vector, instead of distributing the semantic signal evenly across all 3072 dimensions. To use this feature, developers pass the output_dimensionality parameter.\nIt is important to note that dimensions smaller than 3072 are not normalized by default, so vectors must be normalized manually before computing similarity. If this step is skipped, the resulting distance metrics may be distorted.\n“This allows flexible scaling of the output embedding size, reducing it from the default dimension of 3072. As a result, developers can balance performance and storage costs. For the best quality, we recommend using dimensions of 3072, 1536, or 768.”\nTwo-stage retrieval with MRL\nAnother advantage of MRL is that it enables an effective two-stage retrieval algorithm. In the first stage, smaller vectors can be used for fast retrieval from the index. In the second stage, the retrieved results can be re-ranked using the full 3072-dimensional vectors. This approach gives developers the accuracy of a large model with the latency profile of a much smaller one.
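Here is a minimal NumPy sketch of that two-stage pattern. The vectors are random stand-ins for real embeddings (in production the small vectors would come from the output_dimensionality parameter, or from truncating stored 3072-dimensional vectors), and it applies the manual normalization step described above:

import numpy as np

def normalize(v: np.ndarray) -> np.ndarray:
    # Required for truncated vectors: dimensions below 3072 are not unit-length by default.
    return v / np.linalg.norm(v, axis=-1, keepdims=True)

rng = np.random.default_rng(0)
corpus_full = normalize(rng.normal(size=(1000, 3072)))  # stand-ins for stored document vectors
query_full = normalize(rng.normal(size=3072))  # stand-in for a query vector

# Stage 1: coarse search over cheap 768-dimensional prefixes of the same vectors.
corpus_small = normalize(corpus_full[:, :768])
query_small = normalize(query_full[:768])
candidates = np.argsort(corpus_small @ query_small)[-50:]  # top 50 by cosine similarity

# Stage 2: re-rank only those candidates with the full 3072-dimensional vectors.
scores = corpus_full[candidates] @ query_full
top_10 = candidates[np.argsort(scores)[-10:][::-1]]
print(top_10)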
Performance and multimodal capabilities\nGemini Embedding 2 stands out for its performance on multimodal tasks, including advanced capabilities for working with speech, text, images, and video.\nThe new embedding model competes directly with other multimodal embedding providers, as Google takes a significant step forward by supporting different types of data within a single embedding space.\nIn practice, this could reduce the need for separate pipelines for each data modality, since a unified model simplifies the handling of multiple data types. It is still important to note, however, that many real-world deployments require additional layers, including metadata processing, compliance requirements, and access control mechanisms.\nOverall, Google states that the new model sets a new performance standard, improving how developers work with embeddings and enabling more efficient multimodal systems.\nMeaning for enterprise databases\nEmbeddings are widely used in Google’s own products, including Retrieval-Augmented Generation (RAG), large-scale data management, and traditional search systems.\nInformation within companies often exists as a fragmented and scattered set of data. A single customer issue may involve a PDF document, several emails, plain text, and audio. In the past, working with each of these formats required four separate pipelines, but the new Gemini Embedding 2 model changes the situation significantly. With a unified embedding space, it becomes much easier for organizations to perform search regardless of the data format.\nAccording to the company, several early-access partners are already using the model to build multimodal applications.\n“We chose Gemini embeddings to help legal professionals find critical information during the discovery process in litigation – a highly technical challenge in a high-stakes setting, and one Gemini excels at.” – Max Christoff, CTO of Everlaw\nGetting started and pricing\nThe model is already available for testing and integration into projects, but it is still subject to updates and improvements. A full, stable release (General Availability, GA) will roll out later.\nDevelopers can explore interactive Colab notebooks for the Gemini API and Vertex AI to learn how to implement the new model.\nGemini Embedding 2 can also be integrated with popular frameworks and tools such as LangChain, LlamaIndex, Haystack, Weaviate, Qdrant, ChromaDB, and Vector Search.\nThese integrations matter because embedding models are rarely used in isolation. They are typically placed behind a vector index, which stores embeddings of the data corpus and performs nearest-neighbor search. This infrastructure enables a wide range of applications, including enterprise search, customer support assistants, and content moderation systems.\nAccess channels\nAccess to the new Gemini Embedding 2 model is available through two main channels:\nGemini API – designed for developers and rapid prototyping, with a simplified pricing model. Ideal for startups and small businesses.\nVertex AI (Google Cloud) – built for large enterprises and high-scale projects, offering advanced security features and seamless integration with the broader Google Cloud ecosystem.\nGemini API pricing\nFree Tier: developers can experiment with the model at no cost, subject to usage limits (typically 60 requests per minute). Data from this tier may be used by Google to improve its products.\nPaid Tier: for production-level usage – text, images, and video at $0.25 per 1M tokens; native audio (without transcription) at $0.50 per 1M tokens.\nVertex AI pricing for enterprises\nFlex PayGo: ideal for unpredictable workloads; pay only for what you use.\nProvisioned Throughput: guarantees consistent capacity and low-latency performance for high-traffic applications.\nBatch Prediction: optimized for processing massive historical archives where speed is less critical but volume is extremely high.\nAll official Gemini API and Vertex AI Colab notebooks are licensed under Apache License 2.0, a permissive license that allows developers to use, modify, and even commercialize the code without the obligation to open-source their own projects.\n","permalink":"https://coursiv.io/news/google-gemini-embedding-2-multimodal-model/","summary":"Google releases Gemini Embedding 2, the first natively multimodal embedding model that maps text, images, video, audio, and documents into a single embedding space via the Gemini API and Vertex AI.","title":"Google Launches Gemini Embedding 2: The First Natively Multimodal Embedding Model"}]