TL;DR: Open Pre-trained Transformers (OPT) is a suite of decoder-only pre-trained transformers ranging from 125M to 175B parameters, which the authors aim to fully and responsibly share with interested researchers.
Abstract: Large language models, which are often trained for hundreds of thousands of compute days, have shown remarkable capabilities for zero- and few-shot learning. Given their computational cost, these models are difficult to replicate without significant capital. For the few that are available through APIs, no access is granted to the full model weights, making them difficult to study. We present Open Pre-trained Transformers (OPT), a suite of decoder-only pre-trained transformers ranging from 125M to 175B parameters, which we aim to fully and responsibly share with interested researchers. We show that OPT-175B is comparable to GPT-3, while requiring only 1/7th the carbon footprint to develop. We are also releasing our logbook detailing the infrastructure challenges we faced, along with code for experimenting with all of the released models.
TL;DR: The historical distribution of whales shown by logbook records 1785-1849 provides point features that represent the historical locations of right whale catches taken by North American pelagic whaling vessels between 1761 and 1849.
Abstract: Original provider: Wildlife Conservation Society
Dataset credits: Wildlife Conservation Society
Abstract: The Wildlife Conservation Society (WCS) has digitally captured the Townsend Whaling Charts, which were published as a series of 4 charts with the article titled "The distribution of certain whales as shown by logbook records of American whale ships" by Charles Haskins Townsend in the journal Zoologica in 1935. The 4 charts show the locations of over 50,000 captures of 4 whale species: sperm whales (36,908), right whales (8,415), humpback whales (2,883) and bowhead whales (5,114). Capture locations were transcribed from North American (“Yankee”) pelagic whale vessel log books dating from 1761 to 1920 and plotted onto nautical charts in a Mercator projection by a cartographer. Each point plotted on the charts represents the location of a whaling ship on a day when one or more whales were taken and is symbolized by month of the year using a combination of color and open and closed circles. Townsend and his cartographer plotted vessel locations as accurately as possible according to log book records. When plotting locations on an earlier sperm whale chart published in 1931, the cartographer spaced points where locations were very dense, "extending areas slightly" for a number of whaling grounds. However, for charts in preparation at this time, Townsend states that "this difficulty is avoided by omitting some of the data, rather than extend the ground beyond actual whaling limits." We assumed that this statement refers to the 1935 charts, but there is still some question as to whether the cartographer did in fact space locations and thus expand whaling grounds.
Purpose: This dataset provides point features that represent the historical locations of right whale catches taken by North American pelagic whaling vessels between 1761 and 1920. Points were derived from 4 charts that were first scanned on a large format scanner at a resolution of 200 dpi. The charts were then georeferenced in the native projection of the charts, the Mercator projection, using GIS software (ESRI ArcView 3.2). Each vessel capture location plotted on the charts was then digitized as a point feature and attributed with the month of capture. One GIS file (ESRI shapefile) was then created for each whale species represented by the charts: sperm whale, right whale, humpback whale and bowhead whale. Digitizing errors include missed points, particularly from areas of dense chart locations, and incorrect assignment of month of capture because of difficulty distinguishing between chart colors. However, to limit these errors, multiple checks of digitized and chart locations were made and color enhancements of chart scans were used to ensure correct month assignments. Overall we are confident that at least 95% of catch locations have been digitized and that at least 95% of month attributes are correct. For full resolution digital copies of the Townsend charts please contact Gillian Woolmer (gwoolmer@wcs.org).
Supplemental information: [2023-01-31] The year of the date was changed from 1913 to 1849, the midpoint of the time range of the data. WCS digitized the Townsend whaling charts in 2002 using ArcView 3.2 from ESRI. The information WCS has captured for each point location is the whale species (based on the chart) and the month, based on the chart point symbol. Exact dates and the number of whales taken could not be determined. Right whale captures were separated into northern and southern right whale species, based on their geographic location. Since time, count, day, and year were not available, "00:00:00," 1, 1, and 1913 were used, respectively. Only month is available.
TL;DR: To develop a video assessment method for General Practitioners by analysing issues of validity, reliability and feasibility of observation of videotaped regular consultations.
Abstract: Objectives
To develop a video assessment method for General Practitioners (GPs) by analysing issues of validity, reliability and feasibility of observation of videotaped regular consultations.
Design
In a cross-sectional study, consultations of 93 GPs were video recorded in the practice during 1 week. The GPs registered consultation and patient data in a logbook; 16 consultations per GP were selected using preset criteria. The quality of communicative and medical performance of these consultations was assessed by GP observers with a validated instrument. The validity of the procedure was evaluated by checking the content of each GP’s sample using specific sample criteria. Selection bias was estimated by multiple regression analysis, with sample characteristics as independent variables and scores on communication and medical performance as dependent variables. The influence of observation on GPs and patients was assessed by a questionnaire. Generalizability theory was used to estimate reliability. Feasibility was assessed by administering a questionnaire, by keeping accounts, and by checking the technical quality of the videotaped consultations.
Setting
Universities of Nijmegen and Maastricht, The Netherlands.
Subjects
General Practitioners (GPs).
Results
The domain of general practice was well covered in the samples; content validity was satisfactory. With regard to the sample characteristics, only the total duration of consultations appeared to correlate significantly with both the score on communication and the score on medical performance. A majority (71%) of GPs reported not being influenced by the observation, except in the first cases, and recognizing their usual daily performance in the videotaped consultations. An acceptable level of reliability was reached after 2·5 hours of observation, i.e. 12 cases by a single observer. The method was well accepted by both GPs and patients. The costs were £250 per GP.
Conclusions
Video assessment of GPs in daily practice according to the procedures described is a valid and reliable method, one which is useful for education and quality improvement. There is a trade-off between feasibility on one hand and validity, reliability and credibility on the other hand. Compared to investments in observation methods in standardized settings, the costs of video observation of GPs’ actual performance are acceptable.
TL;DR: In this paper, a logbook analysis to allocate the fishing activity due to various fisheries (fleet segments) is integrated with processing of raw satellite-recorded data for identifying trips at sea and fishing sequences.