new

Get trending papers in your email inbox!

Subscribe

Daily Papers

byAK and the research community

Dec 31

EarthSE: A Benchmark for Evaluating Earth Scientific Exploration Capability of LLMs

Advancements in Large Language Models (LLMs) drive interest in scientific applications, necessitating specialized benchmarks such as Earth science. Existing benchmarks either present a general science focus devoid of Earth science specificity or cover isolated subdomains, lacking holistic evaluation. Furthermore, current benchmarks typically neglect the assessment of LLMs' capabilities in open-ended scientific exploration. In this paper, we present a comprehensive and professional benchmark for the Earth sciences, designed to evaluate the capabilities of LLMs in scientific exploration within this domain, spanning from fundamental to advanced levels. Leveraging a corpus of 100,000 research papers, we first construct two Question Answering (QA) datasets: Earth-Iron, which offers extensive question coverage for broad assessment, and Earth-Silver, which features a higher level of difficulty to evaluate professional depth. These datasets encompass five Earth spheres, 114 disciplines, and 11 task categories, assessing foundational knowledge crucial for scientific exploration. Most notably, we introduce Earth-Gold with new metrics, a dataset comprising open-ended multi-turn dialogues specifically designed to evaluate the advanced capabilities of LLMs in scientific exploration, including methodology induction, limitation analysis, and concept proposal. Extensive experiments reveal limitations in 11 leading LLMs across different domains and tasks, highlighting considerable room for improvement in their scientific exploration capabilities. The benchmark is available on https://huggingface.co/ai-earth .

  • 8 authors
·
May 22

ODS: A self-reporting system for radio telescopes to coexist with adaptive satellite constellations

Low Earth orbit (LEO) satellite constellations bring broadband internet and cellular service to the most remote locations on the planet. Unfortunately, many of these locations also host some of the world's best optical and radio astronomy (RA) observatories. With the number of LEO satellites expected to increase by an order of magnitude in the upcoming decade, satellite downlink radio frequency interference (RFI) is a growing concern in protected radio-quiet areas like the United States National Radio Quiet Zone. When these satellites transmit in the spectrum near protected RA bands, undesired out-of-band emission can leak into these protected bands and impact scientific observations. In this paper, we present a self-reporting system - Operational Data Sharing (ODS) - which enables mutual awareness by publishing radio telescopes' operational information to a protected database that is available to satellite operators through a representational state transfer application programming interface (REST API). Satellite operators can use the ODS data to adapt their downlink tasking algorithms in real time to avoid overwhelming sensitive RA facilities, particularly, through the novel Telescope Boresight Avoidance (TBA) technique. Preliminary results from recent experiments between the NRAO and the SpaceX Starlink teams demonstrate the effectiveness of the ODS and TBA in reducing downlink RFI in the Karl G. Jansky Very Large Array's observations in the 1990-1995 MHz and 10.7-12.7 GHz bands. This automated ODS system is beginning to be implemented by other RA facilities and could be utilized by other satellite operators in the near future.

  • 17 authors
·
Feb 20