San Francisco Offers Valentine's Day City Hall Weddings
San Francisco City Hall is hosting special Valentine's Day wedding ceremonies on February 13th. The city clerk's office is performing civil ceremonies for couples throughout the day. The historic building has been decorated for the occasion to accommodate the high demand for holiday nuptials.
- Reinforcement Learning from Human Feedback (RLHF) pipelines, critical for aligning models, involve supervised fine-tuning on domain-specific data, collecting human preference data (e.g., clinicians ranking model outputs), training a reward model on these preferences, and then fine-tuning the language model with a reinforcement learning algorithm to maximize the learned reward. This process is computationally intensive, with a model like GPT-4 requiring hundreds of GPU-days for training.
- Alternatives to traditional RLHF are emerging to reduce complexity and computational cost. Direct Preference Optimization (DPO) bypasses the need for a separate reward model by directly optimizing the language model on preference pairs, making it more efficient for teams with limited resources. Another variation is Reinforcement Learning from AI Feedback (RLAIF), where an AI model generates preference labels, sometimes guided by a "constitution" of principles, to reduce reliance on human-generated feedback for identifying harmful outputs.
- For agentic AI, which can reason and act, evaluation moves beyond text quality to benchmarks that test real-world task completion. Key benchmarks include AgentBench for multi-turn reasoning, WebArena for web navigation tasks, and GAIA for general AI assistant capabilities. These evaluations measure metrics like task success percentage, tool invocation accuracy, and cost per task.
- High-quality human feedback is the foundation of effective model alignment, and sourcing this data often involves more than just crowdsourcing. For specialized domains like finance or healthcare, AI labs rely on subject matter experts (SMEs) to validate outputs, ensuring nuanced understanding. Frameworks like Langfuse are used to collect and integrate human preferences to fine-tune smaller, more efficient models.
- Synthetic data can be a cost-effective way to generate large datasets for training without compromising user privacy. It is particularly useful for augmenting real-world data in scenarios with few examples, such as predicting fraudulent transactions. However, synthetic data can lack the complexity of real-world data and may lead to model degradation if not periodically synced with real data.
- The go-to-market strategy for AI infrastructure startups often involves a hybrid pricing model that combines a base subscription with usage or outcome-based tiers to provide predictability for customers while capturing upside. Since every AI query has a real compute cost, unlike traditional SaaS, pricing must account for these inference costs.
- The fundraising climate for AI startups has seen significant growth, with AI capturing nearly 50% of all global funding in 2025, a substantial increase from 34% in 2024. A total of $202.3 billion was invested in the AI sector in 2025, with foundation model companies like OpenAI and Anthropic raising a significant portion of this capital.
- The rise of AI is creating a new category of jobs focused on data labeling, which is foundational to model training. These roles are often remote and accessible to individuals from diverse backgrounds, serving as an entry point into the tech industry. Building a skilled data labeling workforce requires targeted training programs and can be a key differentiator in producing high-quality AI models.
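
To make the DPO note above more concrete, here is a minimal sketch of the per-pair DPO objective in plain Python. It assumes the summed log-probabilities of the preferred ("chosen") and dispreferred ("rejected") responses have already been computed under both the model being trained and a frozen reference model. The function name, argument names, and the default `beta` are illustrative assumptions, not taken from any particular library.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Per-pair DPO loss: -log(sigmoid(beta * (chosen_margin - rejected_margin))).

    Each margin is how much more (in log-probability) the trained policy
    favors a response than the frozen reference model does.
    All argument names here are hypothetical.
    """
    chosen_margin = policy_chosen_logp - ref_chosen_logp
    rejected_margin = policy_rejected_logp - ref_rejected_logp
    logits = beta * (chosen_margin - rejected_margin)

    # -log(sigmoid(x)) == log(1 + exp(-x)); branch for numerical stability.
    if logits >= 0:
        return math.log1p(math.exp(-logits))
    return -logits + math.log1p(math.exp(logits))
```

When the policy matches the reference on both responses, the loss equals log 2. It falls below that as the policy shifts probability toward the chosen response relative to the reference, which is how the preference pairs steer the model without a separate reward model.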
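
The agent evaluation metrics named above (task success percentage, tool invocation accuracy, cost per task) can be aggregated from per-episode records roughly as follows. This is a sketch under assumed inputs: the `EpisodeResult` fields and the `summarize` function are hypothetical and do not reflect the actual output schema of AgentBench, WebArena, or GAIA.

```python
from dataclasses import dataclass


@dataclass
class EpisodeResult:
    """One agent run on one task (hypothetical record layout)."""
    succeeded: bool          # did the agent complete the task?
    tool_calls: int          # total tool invocations made
    correct_tool_calls: int  # invocations judged correct by the grader
    cost_usd: float          # inference cost for the run


def summarize(results):
    """Aggregate episode records into the three headline metrics."""
    if not results:
        raise ValueError("need at least one episode")
    n = len(results)
    total_calls = sum(r.tool_calls for r in results)
    correct_calls = sum(r.correct_tool_calls for r in results)
    return {
        "task_success_pct": 100.0 * sum(r.succeeded for r in results) / n,
        # Accuracy is pooled over all calls, not averaged per episode.
        "tool_invocation_accuracy": (
            correct_calls / total_calls if total_calls else 0.0
        ),
        "cost_per_task_usd": sum(r.cost_usd for r in results) / n,
    }
```

Pooling tool calls across episodes weights long runs more heavily than short ones; averaging per-episode accuracy instead is an equally reasonable choice, depending on what the evaluation is meant to emphasize.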