galah now supports media downloads for all atlases. The only exceptions are GBIF and France, for which these APIs are not (yet) supported.
dplyr syntax
atlas_species()) now work for Sweden, France, and Spain (#234)
select() now works for species downloads (i.e. via atlas_species(); #185, #227)
filter, group_by etc. not recognising fields (#237)
?taxonomic_searches (#241)
galah_geolocate(type = "radius") added. Supports filtering by point location and radius (in km) (#216)
galah_geolocate() and associated sub-functions for GBIF queries
galah_filter() no longer fails when assertions are specified in galah_filter() (#199)
atlas_species(), particularly for other atlases (#234)
select(), including supporting atlas_species() and adding new group = "taxonomy" option (#218)
collect_media() no longer fails when a thumbnail is missing (#215)
galah_filter() parses apostrophes correctly in value names (#214)
group_by() |> atlas_counts() no longer truncates rows at 30 (#223, #198)
search_values() did not return matched values
show_values() & atlas_counts() return correctly formatted values (#233)
atlas_occurrences() no longer overwrites returned field names with user-supplied ones
galah_apply_profile() now works as expected
show_values() (#235)
collapse() now returns a query object, rather than a query_set, and gains a .expand argument to optionally append a query_set for debugging purposes (#217).
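As a rough illustration of the collapse() change, the sketch below builds a count request and collapses it without downloading any data. It assumes galah (2.0.1 or later) and dplyr are installed; inspect_query() is a hypothetical helper, and the taxon name and year are arbitrary. The structure of the returned object may differ between versions, so no output is shown.

```r
# Sketch only: assumes galah (>= 2.0.1) and dplyr are installed.
# inspect_query() is a hypothetical helper, not part of galah.
# Calls are namespaced, so nothing is evaluated until the helper is called.
inspect_query <- function(taxon, expand = FALSE) {
  request <- galah::galah_call() |>
    galah::galah_identify(taxon) |>
    galah::galah_filter(year == 2020) |>
    dplyr::count()
  # .expand is the argument named in the entry above; when TRUE it asks
  # for the underlying query_set to be appended for debugging.
  dplyr::collapse(request, .expand = expand)
}

# Usage (may contact the atlas to look up metadata):
# inspect_query("Eolophus roseicapilla")
# inspect_query("Eolophus roseicapilla", expand = TRUE)
```

Collapsing first is a cheap way to check which APIs a request would contact before calling compute() or collect().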
query_set that lists all APIs that will be pinged (collapse()), send the queries to required APIs (compute()), and return data as a tibble (collect()) (#183).
galah_filter(), galah_select() and related functions are now evaluated lazily; no API calls are made until compute() is called, meaning that earlier programming stages are faster and easier to debug.
galah_filter() has been upgraded to use a hierarchical parsing architecture suggested by Advanced R. As a result, galah_filter() is faster and evaluates expressions more consistently (#196, #169)
galah_filter() now supports is.na, !, c() & %in% (#196)
galah_config() for better options management (#193)
slice_head() and desc() as masked functions to use in galah atlas_counts() query.
| in galah_filter() (#169)
show_values() errors nicely when API is down (#184)
atlas$region error when loading galah fixed with potions package implementation (#178)
atlas_occurrences(mint_doi = TRUE) (#182)
group_by() sometimes caused an error (#201)
&) in query results (#203)
data_request object when wrapped by a function (#207)
Patch release to fix minor issues on some devel systems on CRAN.
Minor release to address CRAN issues. Last release before 2.0.0.
Minor release to resolve issues on CRAN, and a few recent bugs.
tibble as input to search_taxa() (e.g., to resolve homonyms, #168)
galah_select() while atlas = GBIF (which is not supported; #181)
... in galah_filter() (#186)
An experimental feature of version 1.5.1 is the ability to call functions from other packages (#161), as synonyms for galah_ functions. These are:
identify() ({graphics}) as a synonym for galah_identify()
select() ({dplyr}) as a synonym for galah_select()
group_by() ({dplyr}) as a synonym for galah_group_by()
slice_head() ({dplyr}) as a synonym for the limit argument in atlas_counts()
st_crop() ({sf}) as a synonym for galah_polygon()
count() ({dplyr}) as a synonym for atlas_counts()
These are implemented as S3 methods for objects of class data_request, which are created by galah_call(). Hence new function names only work when piped after galah_call().
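The synonyms listed above might be combined as in the following sketch. It assumes galah (1.5.1 or later) and dplyr are installed and that the atlas is reachable; counts_by_year() is a hypothetical helper and the grouping field year is only an example. In versions where queries are evaluated lazily, a final collect() may also be needed to return a tibble.

```r
# Sketch only: assumes galah (>= 1.5.1) and dplyr are installed.
# counts_by_year() is a hypothetical helper, not part of galah.
counts_by_year <- function(taxon) {
  galah::galah_call() |>          # creates the data_request object
    graphics::identify(taxon) |>  # synonym for galah_identify()
    dplyr::group_by(year) |>      # synonym for galah_group_by()
    dplyr::count()                # synonym for atlas_counts()
}

# Usage (requires network access):
# counts_by_year("Eolophus roseicapilla")
```

Because the methods dispatch on the data_request class, the same verbs keep their usual behaviour on data frames elsewhere in a script.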
The Global Biodiversity Information Facility (GBIF) is the umbrella organisation to which all other atlases supply data. Hence it is logical to be able to query GBIF and its "nodes" (i.e. the living atlases) via a common API. Supported functions are:
search_taxa and galah_identify for name matching
show_all(fields) and show_all(assertions)
show_all() calls that give 'collections' information are limited to 20 records by default, as GBIF datasets are often huge. search_all() is generally more reliable
show_values() for any GBIF field
galah_filter and galah_group_by (and therefore filter and group_by(), see above), but NOT galah_select.
atlas_counts() (and therefore count(), see above)
atlas_occurrences() & atlas_species(); both are implemented via the 'downloads' system, meaning that queries can be larger, but may be slow
The current implementation is experimental and back-end changes are expected in future. Users who require a more stable implementation should use the {rgbif} package.
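A minimal sketch of a GBIF count query using the supported functions listed above. It assumes galah (1.5.1 or later) is installed and that GBIF is reachable; gbif_counts() is a hypothetical helper, and the taxon and year threshold are arbitrary. Since the GBIF implementation is described as experimental, behaviour may differ between versions.

```r
# Sketch only: assumes galah (>= 1.5.1) is installed.
# gbif_counts() is a hypothetical helper, not part of galah.
gbif_counts <- function(taxon, since = 2020) {
  # Note: this changes the atlas for the rest of the R session.
  galah::galah_config(atlas = "GBIF")
  galah::galah_call() |>
    galah::galah_identify(taxon) |>
    galah::galah_filter(year >= since) |>
    galah::atlas_counts()
}

# Usage (requires network access):
# gbif_counts("Eolophus roseicapilla")
```

galah_select() is deliberately left out, since the notes above state that it is not supported for GBIF.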
galah_config() gains a print function, and now uses fuzzy matching for the atlas field to match to region, organisation or acronym (as defined by show_all(atlases)). An example use case is to match to organisations via acronyms, e.g. galah_config(atlas = "ALA").
readr::read_csv in place of utils::read.csv for improved speed
show_all (and associated sub-functions) gain a limit argument, set to NULL (i.e. no limit) by default
galah no longer imports {data.table}, since the only function previously used from that package (rbindlist) is duplicated by dplyr::bind_rows
url_paginate() to handle cases where pagination is needed, but total data length is unknown (e.g. show_all_lists(), #170).
galah_select(group = "assertions") is always enacted properly by atlas_occurrences, and won't lead to overly long urls (#137). When called without any other field names, recordID is added to avoid triggering the 'default' set of columns.
atlas_species works again after some minor changes to the API; but requires a registered email to function
galah_call(), filtered with galah_ functions, and downloaded with atlas_ functions. Previously, this functionality was only possible with queries to the ALA (#126)
atlas_media() has been improved to use 2 simplified functions to show & download media (#145, #151):
atlas_media() returns a tibble of available media files
collect_media() downloads the list of media from atlas_media() to a local machine
type = "thumbnails" in collect_media() (#140)
galah_geolocate() now supports filtering queries using polygons and bounding boxes. Overall improvements and bug fixes to galah_geolocate() through new internal functions galah_polygon() and galah_bbox() (#125)
show_all() and search_all() are flexible look-up functions that can search for all information in {galah}, rather than by separate search_/show_all_ functions (e.g. search_fields(), search_atlases(), show_all_fields(), show_all_reasons(), etc) (#127, #132)
show_values() & search_values() (#131)
galah_apply_profile() function (#130)
galah_ functions (#133)
galah_geolocate() no longer depends on archived {wellknown} package (#141)
galah_filter(species != "") or galah_filter(species == "") (#143)
collect_doi() (#140)
galah_select() no longer adds "basic" group of columns automatically (#128)
galah_config() doesn't display incorrect preserve = TRUE message (#136)
galah_select() (#137)
atlas_counts() and atlas_occurrences() no longer return different record numbers when a field is empty (#138)
atlas_media() results no longer differ to results returned by galah_filter() & atlas_counts() (#151)
ala_
functions are renamed to use the prefix atlas_. This change reflects their functionality with international atlases (i.e., atlas_occurrences, atlas_counts, atlas_species, atlas_media, atlas_taxonomy, atlas_citation) (#103)
select_taxa is replaced by 3 functions: galah_identify, search_taxa and search_identifiers. galah_identify is used when building data queries, whereas search_taxa and search_identifiers are now exclusively used to search for taxonomic information. Syntax changes are intended to reflect their usage and expected output (#112, #122)
select_ functions are renamed to use the prefix galah_. Specifically, galah_filter, galah_select and galah_geolocate replace select_filters, select_columns and select_locations. These syntax changes reflect a move towards consistency with dplyr naming and functionality (#101, #108)
find_ functions that provide a listing of all possible values renamed to show_all_ (i.e., show_all_profiles, show_all_ranks, show_all_atlases, show_all_cached_files, show_all_fields, show_all_reasons). find_ functions that require an input and return specific results renamed to search_ (i.e., search_field_values, search_profile_attributes) (#112, #113)
galah_group_by(), which groups and summarises record counts based on categorical field values, similar to dplyr::group_by() (#90, #95)
galah_down_to() + atlas_taxonomy(), which uses tidy evaluation like other galah_ functions (#101, #120)
|>, %>%) by first using galah_call(), narrowing queries with galah_ functions and finishing queries with an atlas_ function (#60, #120).
galah_filter (#91, #92)
search_taxa returns correct IDs for search terms with parentheses (#96)
search_taxa returns best-fit taxonomic result when ranks are specified in data.frame or tibble (#115)
ala_taxonomy no longer fails for nodes ranked as informal or unranked (#86)
data.tree package
galah
ala_config() has been renamed to galah_config() to improve internal consistency (#68)
search_taxonomy() provides a means to search for taxonomic names and check the results are 'correct' before proceeding to download data via ala_occurrences(), ala_species() or ala_counts() (e.g., not ambiguous or homonymous) (#64 #75)
search_taxonomy() returns information of author and authority of taxonomic names (#79)
search_taxonomy() consistently orders column names, including in correct taxonomic order by rank (#81)
find_cached_files() lists all user cached files and stored metadata (#57)
clear_cached_files() removes previously cached files and stored metadata (#71)
ala_counts(), ala_occurrences(), ala_media() and ala_species() now have refresh_cache argument to remove previously cached files and replace with the current query (#71)
ala_media() caches media metadata if galah_config(caching = TRUE)
search_fields() allows the user to pass a qid as an argument (#59)
galah_config(run_checks = FALSE). This helps users avoid slowing down data request download speeds when many requests are made in quick succession via galah_filter() or ala_occurrences() (#61, #80)
ala_counts(), select_columns() and search_fields() now use match.arg to approximate strings through fuzzy matching (#66)
select_columns(group = 'assertions') now sends qa = includeall to ALA web service API to return all assertion columns (#48)
ala_occurrences() returns data DOI when ala_occurrences(mint_doi = TRUE) and re-downloads data when called multiple times (#56)
ala_occurrences() no longer converts field names with all-CAPS to camelCase (#62)
ala_config() allows users to specify an international Atlas to download data from (#21)
ala_media() includes the file path to the downloaded media in the returned metadata (#22)
ala_occurrences() contains the search_url used to download records; this takes the user to the website search page (#32)
ala_species() provides a more helpful error if no species are found (#39)
select_taxa() has an optional all_ranks argument to return intermediate rank information (#35)
select_taxa() behaves as expected when character strings of 32 or 36 characters are provided (#23)
ala_occurrences() uses the columns as expected (#30)
galah_filter() negates assertion filters when required, fixing the issue of assertion values being ignored (#27)
select_taxa() no longer throws an error when queries of more than one term have a differing number of columns in the return value (#41)
ala_counts() returns data.frame with consistent column classes when a group_by parameter is called multiple times and ala_config(caching = TRUE) (#47)
ala_ functions fail gracefully if a non-id character string is passed (#49)
ala_media() now takes the same select_ arguments as other ala_ functions (#18)
search_fields now has media as a type argument option
verbose == TRUE (#8)
galah_location auto-detects the type of argument provided and so takes a single argument, query, in place of sf and wkt (#17)
select_taxa auto-detects the type of argument provided and so takes a single argument, query, in place of term and term_type (#16)
ala_counts uses the group_by field name as the returned data.frame column name (#6)
ala_occurrences sends sourceId parameter to ALA (#5)
search_fields provides a more helpful error for invalid types (#11)
First version of galah, built on earlier functionality from the ALA4R package.