User:Monkbot/task 19: cite iucn update

Task 19 was originally conceived to update, from the IUCN Red List API, the 13,000 or so articles that use {{cite IUCN}} where |url= holds an old-form IUCN url. These articles are listed in Category:cite IUCN maint (1,161).

There are several old-form urls (not all of these work):

Old-form urls are considered 'old-form' because (when they work) they always point to the current assessment.

Most of these old-form urls are used in {{cite IUCN}} templates that are found in the |status_ref= parameter of {{speciesbox}} and {{taxobox}} templates (collectively hereafter 'taxobox') to support the values in the taxobox |status= and |status_system= parameters. Because values for |status= (IUCN uses the term 'category') and for |status_system= can be extracted or derived from the results of an additional IUCN API call, task 19 was expanded to support updating these taxobox parameters.

IUCN API

This task is generally slow. IUCN do not want anyone or anything hammering away at their API as fast as possible so task 19's calls to the IUCN API are spaced about 3 seconds apart. To accomplish this, the AWB Bots→Auto save→Delay setting is 3 seconds. This prevents task 19 from making edits that require only a single IUCN API call too quickly. For edits that require multiple IUCN API calls, task 19 imposes a 3-second pause before executing each IUCN API call after the first one.
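The pacing described above can be sketched as follows. This is an illustration in Python only, not the bot's code (the published script is an AWB C# module). The class name `ApiThrottle` and the injectable `clock` and `sleep` arguments are hypothetical. The sketch also simplifies by treating every call the same way, whereas task 19 relies on the AWB auto-save delay for the first call.

```python
import time


class ApiThrottle:
    """Space successive API calls at least `min_interval` seconds apart."""

    def __init__(self, min_interval=3.0, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock          # injectable so the pacing can be tested without waiting
        self._sleep = sleep
        self._last_call = None       # time of the previous call; None before the first call

    def wait(self):
        """Block until the next API call is allowed, then record its time."""
        now = self._clock()
        if self._last_call is not None:
            remaining = self.min_interval - (now - self._last_call)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last_call = now
```

Calling `wait()` immediately before each API request keeps requests about three seconds apart without delaying the first one.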

IUCN API calls require a token. While the code for this task is published, the task's token is not. Anyone considering reuse of this code must obtain their own token; do not use the publicly available demo token.

Task 19 fetches data from the IUCN API in four forms: species data by name, species data by taxon id, citation data by name, and citation data by taxon id. The examples here use Anthus roseatus (the name) and 22718564 (the taxon id). The species data returns are:

name:
{"name":"Anthus roseatus","result":[{"taxonid":22718564,"scientific_name":"Anthus roseatus","kingdom":"ANIMALIA","phylum":"CHORDATA","class":"AVES","order":"PASSERIFORMES","family":"MOTACILLIDAE","genus":"Anthus","main_common_name":"Rosy Pipit","authority":"Blyth, 1847","published_year":2019,"assessment_date":"2019-06-13","category":"LC","criteria":null,"population_trend":"Stable","marine_system":false,"freshwater_system":true,"terrestrial_system":true,"assessor":"BirdLife International","reviewer":"Smith, D.","aoo_km2":null,"eoo_km2":"3530000","elevation_upper":5000,"elevation_lower":2700,"depth_upper":null,"depth_lower":null,"errata_flag":null,"errata_reason":null,"amended_flag":null,"amended_reason":null}]}
taxon id:
{"name":"22718564","result":[{"taxonid":22718564,"scientific_name":"Anthus roseatus","kingdom":"ANIMALIA","phylum":"CHORDATA","class":"AVES","order":"PASSERIFORMES","family":"MOTACILLIDAE","genus":"Anthus","main_common_name":"Rosy Pipit","authority":"Blyth, 1847","published_year":2019,"assessment_date":"2019-06-13","category":"LC","criteria":null,"population_trend":"Stable","marine_system":false,"freshwater_system":true,"terrestrial_system":true,"assessor":"BirdLife International","reviewer":"Smith, D.","aoo_km2":null,"eoo_km2":"3530000","elevation_upper":5000,"elevation_lower":2700,"depth_upper":null,"depth_lower":null,"errata_flag":null,"errata_reason":null,"amended_flag":null,"amended_reason":null}]}

The citation data returns are:

name:
{"name":"Anthus roseatus","result":[{"citation":"BirdLife International 2019. Anthus roseatus. The IUCN Red List of Threatened Species 2019: e.T22718564A152671411. https://dx.doi.org/10.2305/IUCN.UK.2019-3.RLTS.T22718564A152671411.en .Downloaded on 21 September 2021"}]}
taxon id:
{"name":"22718564","result":[{"citation":"BirdLife International 2019. Anthus roseatus. The IUCN Red List of Threatened Species 2019: e.T22718564A152671411. https://dx.doi.org/10.2305/IUCN.UK.2019-3.RLTS.T22718564A152671411.en .Downloaded on 21 September 2021"}]}
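A citation return like the ones above could be split into its parts roughly as follows. This is a hedged Python sketch, not the bot's parser. The regular expression is an assumption fitted to the single example shown here, and the names `CITATION_RE`, `parse_citation`, and `SAMPLE_RETURN` are hypothetical.

```python
import json
import re

# Assumed layout: "<authors> <year>. <title>. The IUCN Red List of Threatened
# Species <year>: e.T<taxon id>A<assessment id>. <doi url> .Downloaded on <date>"
CITATION_RE = re.compile(
    r"^(?P<author>.+?) (?P<year>\d{4})\. "
    r"(?P<title>.+?)\. "
    r"The IUCN Red List of Threatened Species (?P<volume>\d{4}): "
    r"(?P<page>e\.T(?P<taxon_id>\d+)A(?P<assessment_id>\d+))\. "
    r".*?(?P<doi>10\.2305/IUCN\.UK\.[\w.-]+?\.RLTS\.T\d+A\d+\.en)"
)

SAMPLE_RETURN = (
    '{"name":"22718564","result":[{"citation":"BirdLife International 2019. '
    'Anthus roseatus. The IUCN Red List of Threatened Species 2019: '
    'e.T22718564A152671411. '
    'https://dx.doi.org/10.2305/IUCN.UK.2019-3.RLTS.T22718564A152671411.en '
    '.Downloaded on 21 September 2021"}]}'
)


def parse_citation(api_return):
    """Return a dict of citation parts, or None when nothing usable came back."""
    result = json.loads(api_return).get("result") or []
    if not result or "citation" not in result[0]:
        return None                      # the API returned no citation
    match = CITATION_RE.match(result[0]["citation"])
    return match.groupdict() if match else None


fields = parse_citation(SAMPLE_RETURN)
```

The extracted `author`, `year`, `title`, `page`, and `doi` values correspond to the parameters a {{cite IUCN}} template needs. Citations with other layouts would need a more forgiving pattern.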

taxobox updates

Task 19 confirms, updates, or adds taxobox parameters |status=, |status_system=, and |status_ref= using data extracted from the IUCN API. The IUCN API data are fetched using a binomial species name; task 19 does not attempt to fetch IUCN API data using the taxon id found in any existing IUCN references in the taxobox. For taxobox updates, task 19 attempts to get the binomial from various taxobox parameters:

  • {{speciesbox}} parameters
    1. |taxon=
    2. |genus= + |species=
    3. |name=
  • {{taxobox}} parameters
    1. |binomial=
    2. |name=

When the taxobox has none of the above parameters, task 19 uses the article title in the IUCN API call.
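The lookup order above can be sketched as follows. This is an illustrative Python sketch, assuming the taxobox parameters have already been parsed into a dictionary; the function name `binomial_candidate` is hypothetical. The real script also cleans up markup and extinction markers, which this sketch omits.

```python
def binomial_candidate(template_name, params, article_title):
    """Return the name to send to the IUCN API, following the documented order."""

    def value(key):
        return (params.get(key) or "").strip()

    template = template_name.strip().lower()
    if template == "speciesbox":
        if value("taxon"):
            return value("taxon")
        if value("genus") and value("species"):
            return value("genus") + " " + value("species")
        if value("name"):
            return value("name")
    elif template == "taxobox":
        if value("binomial"):
            return value("binomial")
        if value("name"):
            return value("name")
    return article_title                 # fallback when no parameter is usable
```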

Task 19 does not confirm, update, or add |status=, |status_system=, and |status_ref= when:

  • the binomial is not a binomial; usually because the taxobox or article title uses only the genus portion of the binomial
  • the IUCN API does not recognize the binomial as a valid name. When this happens task 19 adds Category:Taxobox binomials not recognized by IUCN and a hidden comment with the unrecognized binomial. Reasons that the IUCN API might not recognize the binomial are:
    • misspellings
    • typos
    • extraneous text
    • species name might not be 'globally assessed' but instead be 'regionally assessed' – the taxobox does not specify the region of an assessment so task 19 cannot use the regional form of the citation API call
    • the IUCN API does not support the redirect-like behavior for binomials that the search box at https://www.iucnredlist.org/ does

{{speciesbox}} parameters |status2=, |status2_system=, and |status2_ref= are not handled in the same way as their non-enumerated counterparts. This is because there are relatively few instances of the enumerated forms (~25 according to this search 2021-09-20). |status2_ref= may be updated by subsequent task 19 processes but |status2= and |status2_system= will not be.

{{automatic taxobox}} and {{subspeciesbox}} support |status=, |status_system=, and |status_ref= but task 19 does not attempt to update these parameters as a group because the use of these parameters in those templates is comparatively rare and because species names upon which task 19 depends are inconsistent in comparison to {{speciesbox}} and {{taxobox}}. Task 19 may choose to update the content of |status_ref= in these templates if the parameter uses an old-form url or is a plain-text citation but will not attempt to update |status= and |status_system= nor will it remove duplicate |status_ref= references.

IUCN status

From the IUCN API call for species data using the binomial, task 19 extracts the category value and the assessment_date value. The species IUCN status is confirmed when |status= has the same value as the category returned from the IUCN API. When they are different, task 19 updates |status= to the value from the IUCN API. When |status= is empty or missing (because it was never there or because an empty parameter was deleted), task 19 fills in the empty |status= or adds a new |status= at the end of the taxobox. Updates, confirmations, and additions are noted in the edit summary.
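The confirm / update / add decision can be sketched like this. It is a hedged Python illustration only. The function name `status_action` is hypothetical, the sample return is abbreviated to the fields that matter here, and the exact-string comparison is an assumption about how the values are compared.

```python
import json


def status_action(current_status, api_return):
    """Return (action, category) for a taxobox |status= value, or None on a nil return."""
    result = json.loads(api_return).get("result") or []
    if not result or not result[0].get("category"):
        return None                          # the API did not return species data
    category = result[0]["category"]
    if not current_status:
        return ("added", category)           # |status= empty or missing
    if current_status.strip() == category:
        return ("confirmed", category)
    return ("updated", category)


# Abbreviated species data return (only the fields used above)
sample = ('{"name":"Anthus roseatus","result":[{"taxonid":22718564,'
          '"assessment_date":"2019-06-13","category":"LC"}]}')
```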

IUCN status displayed on an IUCNredlist web page may be different from the category returned from the IUCN API – task 19 uses the IUCN API's category; cf. (as of 2021-09-22):

  • NT (from the Zenia insignis web page)
  • LR/nt (from the IUCN API):
    {"name":"32462","result":[{"taxonid":32462,"scientific_name":"Zenia insignis","kingdom":"PLANTAE","phylum":"TRACHEOPHYTA","class":"MAGNOLIOPSIDA","order":"FABALES","family":"FABACEAE","genus":"Zenia","main_common_name":null,"authority":"Chun","published_year":1998,"assessment_date":"1998-01-01","category":"LR/nt","criteria":null,"population_trend":null,"marine_system":false,"freshwater_system":false,"terrestrial_system":true,"assessor":"World Conservation Monitoring Centre","reviewer":"","aoo_km2":null,"eoo_km2":null,"elevation_upper":null,"elevation_lower":null,"depth_upper":null,"depth_lower":null,"errata_flag":null,"errata_reason":null,"amended_flag":null,"amended_reason":null}]}

IUCN status system

To update or add a taxobox |status_system= parameter, task 19 extracts the year portion from the IUCN API's assessment_date value. If the assessment year is 2000 or earlier, task 19 sets |status_system=IUCN2.3 otherwise |status_system=IUCN3.1. The threshold date is taken from Wikipedia:Conservation status. When |status_system= is missing, task 19 adds a new parameter at the end of the taxobox. Updates and additions are noted in the edit summary, confirmations are not.
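As a small illustration of the year threshold described above (a Python sketch; `status_system_for` is a hypothetical name, and the input is assumed to be the API's `assessment_date` string in `YYYY-MM-DD` form):

```python
def status_system_for(assessment_date):
    """Map an IUCN API assessment_date ('YYYY-MM-DD') to a |status_system= value."""
    year = int(assessment_date[:4])          # only the year portion is used
    return "IUCN2.3" if year <= 2000 else "IUCN3.1"
```

With the two API returns quoted on this page, Anthus roseatus (assessed 2019-06-13) maps to IUCN3.1 and Zenia insignis (assessed 1998-01-01) maps to IUCN2.3.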

IUCN status reference

To update or add |status_ref=, task 19 inspects the parameter value for a date that task 19 would have written (<ref name="iucn status date">...</ref>) or the existing citation's |access-date= (in that order). When a date can be extracted from one of these, it is compared to the current date. Task 19 will attempt to update |status_ref= only when the difference between the current date and the reference date is greater than six months or when no date can be extracted. This six-month limit was arbitrarily chosen on the presumption that IUCN updates their database twice a year.
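The age test can be sketched as below. This Python sketch rests on two assumptions that the text does not state: dates are written as "21 September 2021" (day, English month name, year), and "six months" is approximated as 183 days. The names `reference_date` and `needs_update` are hypothetical.

```python
import re
from datetime import date, datetime

REF_NAME_DATE_RE = re.compile(r'<ref\s+name\s*=\s*"iucn status ([^"]+)"')
ACCESS_DATE_RE = re.compile(r"\|\s*access-date\s*=\s*([^|}]+)")


def reference_date(status_ref):
    """Try the ref name first, then |access-date=; return a date or None."""
    for pattern in (REF_NAME_DATE_RE, ACCESS_DATE_RE):
        match = pattern.search(status_ref)
        if match:
            try:
                return datetime.strptime(match.group(1).strip(), "%d %B %Y").date()
            except ValueError:
                continue                     # not in the assumed date format
    return None


def needs_update(status_ref, today, max_age_days=183):
    """True when no date can be extracted or the reference is older than the limit."""
    ref_date = reference_date(status_ref)
    return ref_date is None or (today - ref_date).days > max_age_days
```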

Task 19 will not update templated citations in |status_ref= if the citation has one of:

  • |amends=<year>
  • |errata=<year>

Similarly, task 19 will not update plain-text citations in |status_ref= if the citation has one of:

  • (amended version of <year> assessment)
  • (errata version published in <year>)

This is because the IUCN API does not provide the <year> of amendment or errata.
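A check for these amended and errata markers might look like the following Python sketch. The patterns are assumptions modelled on the parameter names and strings listed above, and `is_amended_or_errata` is a hypothetical name.

```python
import re

# |amends=<year> or |errata=<year> in a templated citation
TEMPLATE_FLAG_RE = re.compile(r"\|\s*(?:errata|amends)\s*=\s*\d{4}")
# the equivalent parenthetical strings in a plain-text citation
PLAIN_TEXT_FLAG_RE = re.compile(
    r"\((?:amended version of \d{4} assessment|errata version published in \d{4})\)"
)


def is_amended_or_errata(citation):
    """True when the citation carries an amended or errata marker and must be left alone."""
    return bool(TEMPLATE_FLAG_RE.search(citation) or PLAIN_TEXT_FLAG_RE.search(citation))
```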

When the six-month limit is met, and when the citation in |status_ref= does not hold the amended or errata parameters or strings, task 19 then inspects the associated reference tag:

  1. <ref> – unnamed reference;
    • replaces the value assigned to |status_ref= with <ref name="iucn status date"><new {{cite IUCN}} from IUCN API></ref>
    where date in name="iucn status date" is a copy of the value assigned to the new {{cite IUCN}} template's |access-date= parameter
  2. <ref name=name> – named reference:
    • replaces that reference with <ref name="iucn status date"><new {{cite IUCN}} from IUCN API></ref>
    • replaces all instances of <ref name=name /> with <ref name="iucn status date" />
    where date in name="iucn status date" is a copy of the value assigned to the new {{cite IUCN}} template's |access-date= parameter
  3. <ref name=name /> – named self-closed reference:
    • swaps the self-closed reference tag with the reference definition
    • replaces the citation as described in 2
    • if the definition was (and now the self-closed ref tag is) inside {{reflist|refs=}} then the self-closed ref tag is deleted

{{cite IUCN}} template updates

For {{cite IUCN}} templates that have old-form urls, task 19 extracts the taxon id from the url and attempts to fetch citation data from the IUCN API using the taxon id. If the IUCN API does not recognize the taxon ID, task 19 will attempt to get a citation from the API by using the value assigned to |title= in the {{cite IUCN}} template. When successful, task 19 replaces the old {{cite IUCN}} template with a new {{cite IUCN}} template that has parameter values from the IUCN API citation.

When the taxon/assessment ids in a new {{cite IUCN}} template's |page= and |doi= parameters are not the same, the citation is not updated because {{cite IUCN}} will emit a |doi= / |url= mismatch error message. The mismatch is usually an indication that the assessment has errata. The citation rendered on an IUCN species web page indicates the errata year but, at the time of this writing, that value is not available in the citation returned from the IUCN API. IUCN have been notified of this discrepancy.
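The |page= / |doi= comparison can be illustrated with a short Python sketch. The name `page_doi_ids_match` is hypothetical, and treating a missing identifier as a mismatch is an assumption.

```python
import re

ID_PAIR_RE = re.compile(r"T(\d+)A(\d+)")     # T<taxon id>A<assessment id>


def page_doi_ids_match(page, doi):
    """True when |page= and |doi= carry the same taxon id and assessment id."""
    page_ids = ID_PAIR_RE.search(page)
    doi_ids = ID_PAIR_RE.search(doi)
    return bool(page_ids and doi_ids and page_ids.groups() == doi_ids.groups())
```

For the Anthus roseatus citation quoted earlier, the page `e.T22718564A152671411` and the doi `10.2305/IUCN.UK.2019-3.RLTS.T22718564A152671411.en` agree, so that citation would be eligible for update.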

plain-text citation updates

For the purposes of this task, plain-text references are untemplated IUCN references inside named or unnamed <ref>...</ref> tags or IUCN references as a line item in an unordered list (* markup). Task 19 will update plain-text references when it can extract a taxon id from an IUCN page identifier (e.T###A###), from an IUCN doi (as a doi inside {{doi}} or as a url), or from an IUCN url.
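Taxon id extraction from a plain-text reference might be sketched as follows. This Python sketch covers only the page identifier and doi forms, tried in that order. The url forms are left out because their patterns are not listed here. The names `PAGE_ID_RE`, `DOI_RE`, and `taxon_id_from_plain_text` are hypothetical.

```python
import re

PAGE_ID_RE = re.compile(r"\be\.T(\d+)A\d+")                              # e.T###A###
DOI_RE = re.compile(r"10\.2305/IUCN\.UK\.[\w.-]+?\.RLTS\.T(\d+)A\d+")    # IUCN doi


def taxon_id_from_plain_text(text):
    """Return the taxon id as a string, or None when no identifier is found."""
    for pattern in (PAGE_ID_RE, DOI_RE):     # page identifier first, then doi
        match = pattern.search(text)
        if match:
            return match.group(1)
    return None
```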

duplicate citations

Task 19 will replace named and unnamed references that hold {{cite IUCN}} templates that match {{cite IUCN}} in |status_ref= with <ref name="iucn status date" /> tags. <ref name=name /> associated with named references that hold {{cite IUCN}} templates that match {{cite IUCN}} in |status_ref= are replaced with <ref name="iucn status date" /> tags.

Duplicate references that wholly make up an entry in an unordered list are deleted as redundant.

Task 19 does not remove any other references.

ancillary tasks

Task 19 may update a {{IUCN status}} template's status value in its first positional parameter ({{{1|}}}) from the IUCN API when {{IUCN status}} has a valid taxon id as its second positional parameter ({{{2|}}}).

As with all other monkbot tasks, task 19 does not run with AWB general fixes turned on.

abandoned edits

Task 19 will abandon edits when:

  • the article uses {{r}}
  • the article uses {{#tag:ref}} parser functions
  • the number of {{cite IUCN}} templates evaluated is equal to the number of IUCN API calls that returned nil values
  • the article contains {{bots|deny=monkbot/task 19}}
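These abandon conditions can be sketched in Python as below. The first two patterns mirror the regular expressions used in the script. The {{bots}} pattern and the requirement that at least one template was evaluated are assumptions, and `abandon_reason` is a hypothetical name.

```python
import re

TAG_REF_RE = re.compile(r"\{\{\s*#tag:ref")          # same pattern as the script
R_TEMPLATE_RE = re.compile(r"\{\{\s*[Rr]\s*\|")      # same pattern as the script
# Assumption: a simple opt-out form; {{bots}} allows other variants not handled here
BOTS_DENY_RE = re.compile(r"\{\{\s*bots\s*\|\s*deny\s*=[^}]*monkbot/task 19", re.IGNORECASE)


def abandon_reason(article_text, templates_evaluated=0, nil_returns=0):
    """Return a short reason string when the edit should be abandoned, else None."""
    if TAG_REF_RE.search(article_text):
        return "uses {{#tag:ref}}"
    if R_TEMPLATE_RE.search(article_text):
        return "uses {{r}}"
    if BOTS_DENY_RE.search(article_text):
        return "bots deny"
    if templates_evaluated > 0 and templates_evaluated == nil_returns:
        return "all API calls returned nil"
    return None
```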

edit summaries

Task 19 emits terse edit summaries. An edit summary is a concatenation of one or more of these message fragments:

  • IUCN status confirmed (n×) – number of taxobox |status= and {{IUCN status}} values that were confirmed to match the IUCN API returned value; when there is only one confirmation (the most common case), the parenthetical count is omitted
  • IUCN status updated (n×) – number of taxobox |status= and {{IUCN status}} values that were updated to match the IUCN API returned value; when there is only one update, the parenthetical count is omitted
  • IUCN status added – a taxobox |status= parameter was added using the IUCN API returned value
  • IUCN status system updated – a taxobox |status_system= parameter was updated to match the IUCN API returned value
  • IUCN status system added – a taxobox |status_system= parameter was added using the IUCN API returned value
  • IUCN status ref updated – a taxobox |status_ref= parameter was updated to match the IUCN API returned value
  • IUCN status ref added – a taxobox |status_ref= parameter was added using the IUCN API returned value
    • [duplicate removed] or [duplicates removed (n×)] – suffix added to 'IUCN status ref updated' or 'IUCN status ref added' messages when duplicate reference(s) have been removed
  • IUCN status ref current – the citation in |status_ref= is not older than six months
  • evaluated n template(s) – the number of {{cite IUCN}} templates that task 19 inspected for use of old-form urls
  • n template(s) modified – the number of {{cite IUCN}} templates with old-form urls that task 19 updated
  • evaluated n reference(s) – the number of plain-text references that task 19 inspected
  • n reference(s) modified – the number of plain-text references that task 19 updated
  • API species nil return (id) (n×) – emitted when IUCN API did not return species data for a given taxon id
  • API species nil return (name) (n×) – emitted when IUCN API did not return species data for a given species name
  • API cite nil return (n×) – emitted when IUCN API did not return citation data (species name or taxon id)
  • unrecognized binomial: binomial – the binomial that task 19 used to fetch data from the IUCN API for the taxobox parameter
  • (n/mm:ss.ms) – n is the number of IUCN API calls; mm:ss.ms – minutes, seconds and milliseconds required to process the article
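How counted fragments might be assembled is sketched below in Python. The "; " separator and the names `counted` and `build_summary` are assumptions for illustration. Only the rule that the parenthetical count is omitted for a single occurrence comes from the list above.

```python
def counted(fragment, n):
    """Return the fragment with an (n×) suffix, omitting the count when n is 1."""
    if n <= 0:
        return None                          # nothing to report
    return fragment if n == 1 else "%s (%d×)" % (fragment, n)


def build_summary(confirmed=0, updated=0, status_added=False):
    """Join the fragments that apply; the separator is an assumption."""
    fragments = [
        counted("IUCN status confirmed", confirmed),
        counted("IUCN status updated", updated),
        "IUCN status added" if status_added else None,
    ]
    return "; ".join(f for f in fragments if f)
```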

script

/*
use the iucn api to fetch IUCN categories to update {{taxobox}} and {{speciesbox}} |status= and |status_system=
parameters

use the iucn api to fetch assessment citations to update {{taxobox}} and {{speciesbox}} |status_ref= parameters
with current {{cite IUCN}} templates

use the iucn api to fetch assessment citations to update {{cite IUCN}} templates with old-form urls

use the iucn api to fetch IUCN categories to update the first positional parameter (the status) in {{IUCN status}} templates

source categories:
		Category:Taxonomy articles created by Polbot
		Category:cite IUCN maint

source searches:
		insource:/Downloaded on [0-3][0-9] +[JFMASOND][a-z]+ +[0-9]{4}/
		hastemplate:"cite IUCN" -incategory:"Taxobox binomials not recognized by IUCN" -insource:/iucn status [0-9]+[^0-9]+2021/
*/


//---------------------------< P R O C E S S   A R T I C L E >------------------------------------------------
//
//
//

List<string> error_log_list = new List<string>();


public string ProcessArticle(string ArticleText, string ArticleTitle, int wikiNamespace, out string Summary, out bool Skip)
	{
	Skip = false;										// assume that something will be changed

														// these use redirect to User:Monkbot/task 19: cite IUCN update
//	Summary = "[[User:Monkbot/task 19|Task 19]] (manual dev test): convert/update IUCN references to {{[[Template:cite IUCN|cite IUCN]]}} using data from [[IUCN Red List]] [[API]];";
//	Summary = "[[User:Monkbot/task 19|Task 19]] (BRFA trial): convert/update IUCN references to {{[[Template:cite IUCN|cite IUCN]]}} using data from [[IUCN Red List]] [[API]];";
	Summary = "[[User:Monkbot/task 19|Task 19]]: convert/update IUCN references to {{[[Template:cite IUCN|cite IUCN]]}} using data from [[IUCN Red List]] [[API]];";

	int		template_modified_count = 0;				// number of cite IUCN templates that were modified from the iucn api
	int		other_template_modified_count = 0;			// number of cite journal/web templates that were converted to {{cite IUCN}}

														// reset these static counters
	plain_text_modified_count = 0;						// number of plain-text citations that were modified from the iucn api
	plain_text_count = 0;								// total number of plain-text iucn references
	api_call_count = 0;									// number of api calls made
	api_fetch_fail_count = 0;							// number of api fetches that failed
	api_no_cite_return_count = 0;						// number of times that the api returned a non-citation value

	api_no_species_return_name_count = 0;				// number of times that the api returned a non-species value (species binomial)
	api_no_species_return_id_count = 0;					// number of times that the api returned a non-species value (species id for {{IUCN status}})
	iucn_status_confirmed_count = 0;					// number of times that we confirmed the iucn status in taxobox-like templates
	iucn_status_updated_count = 0;						// number of times that we updated the iucn status in taxobox-like templates
	iucn_status_system_updated_count = 0;				// number of times that we updated the iucn status system in taxobox-like templates
	iucn_template_count = 0;							// total number of cite IUCN templates
	other_template_count = 0;							// total number of cite journal/web templates


	parse_fail_count = 0;								// number of times that we couldn't parse the api return
	page_doi_skip_count = 0;							// number of templates or plain-text references skipped because page and doi assessment ID mismatch

	status_added = false;								// set to true when |status= created
	status_system_added = false;						// set to true when |status_system= created
	status_ref_added = false;							// set to true when |status_ref= created
	status_ref_updated = false;							// set to true when |status_ref= updated
	status_ref_current = false;							// set to true when |status_ref= less than 6 months old
	duplicates_removed_count = 0;						// number of duplicate status references removed

	taxobox_blank = null;								// gets blank taxobox as flag
	unrecognized_species_name = null;					// gets taxobox species name that IUCN doesn't recognize


	System.Diagnostics.Stopwatch stopwatch = new System.Diagnostics.Stopwatch();		// set up a stopwatch
	stopwatch.Start();																	// and start it

	if (Regex.Match (ArticleText, @"\{\{\s*#tag:ref").Success)
		{
		Summary = "Article uses {{#tag:ref}} parser function(s)";
		error_log_add ("Article uses " + code_nowiki("{{#tag:ref}}") + " parser function(s)");		// add error message to list
		log_errors (ArticleTitle, error_log_list);										// dump list to the log file
		Skip = true;
		return ArticleText;
		}

	if (Regex.Match (ArticleText, @"\{\{\s*[Rr]\s*\|").Success)
		{
		Summary = "Article has {{r}} template(s)";
		error_log_add ("Article has " + code_nowiki("{{r}}") + " template(s)");			// add error message to list
		log_errors (ArticleTitle, error_log_list);										// dump list to the log file
		Skip = true;
		return ArticleText;
		}

	if (null == api_token)
		{
		string token = null;
		try
			{
			using (System.IO.StreamReader sr = new System.IO.StreamReader (iucn_api_token_file))	// open the api token file for reading
				token = sr.ReadLine();													// read the token (must be the only thing in the file)
			}
		catch (System.IO.IOException)													// missing or unreadable file is reported below
			{
			}

		if (!string.IsNullOrEmpty (token) && (0 != token.Trim().Length))				// never build "?token=" from a missing or blank token
			api_token = "?token=" + token.Trim();
		}

	if (null == api_token)																// token file missing, unreadable, or empty
		{
		Summary = "Failed to read: " + iucn_api_token_file;								// announce failure
		error_log_add ("Failed to read: " + iucn_api_token_file);						// add error message to list
		log_errors (ArticleTitle, error_log_list);										// dump list to the log file
		Skip = true;
		return ArticleText;
		}

	ArticleText = Regex.Replace (ArticleText, @"[\r\n]+\[\[Category:Taxobox binomials not recognized by IUCN\]\][^\r\n]*", "");		// remove if present; will be restored if necessary


//---------------------------< T A X O B O X >----------------------------------------------------------------
//
// <taxobox> holds the content of {{taxobox}} or {{Speciesbox}} and then is modified by taxobox_update().  The
// source template in <ArticleText> is replaced with an empty skeleton ('{{taxobox}}' or '{{Speciesbox}}' but
// without contents).  At the end, this skeleton is replaced with the modified taxobox held in <taxobox>.
//
// The reason for this round-about is to prevent other portions of this script from evaluating and tallying
// the reference in |status_ref=.  Also permits easy replacement of references that duplicate the reference in
// |status_ref=.
//

	ArticleText = Regex.Replace (ArticleText, hide_non_ref_tag_pattern, hide_non_ref_replace_val);

	ArticleText = hide (ArticleText, HIDE_ALL_BUT_TAXOBOX);								// hide all templates except taxobox-like templates
	ArticleText = hide (ArticleText, HIDE_ALL_BUT_TAXOBOX);								// hide all templates except taxobox-like templates
//if (1 == 1) return ArticleText;
	string taxobox = taxobox_get (ArticleText);
	taxobox_status_ref = null;															// reset the 'new' value for |status_ref=; used at the end to remove duplicates
	taxobox_status_ref_open_tag = null;													// its matching ref open tag
	taxobox_status_ref_sc_tag = null;													// and its matching self-closed tag

	taxobox_update (ref taxobox, ref ArticleText, ArticleTitle);						// update the taxobox |status=, |status_system=, and |status_ref=

	ArticleText = unhide (ArticleText);


//---------------------------< C I T E   I U C N   U P D A T E S >--------------------------------------------
//
// this segment updates {{cite IUCN}} templates that have old-form urls.  There are a variety of old-form urls
// but the most common indicator is the taxon id followed by a zero (0) for the assessment id.  This section
// fetches the current citation from the IUCN API using the taxon id (preferred) or using the 'name' in |title=.
// The 'name' in |title= is presumed to be an italicized binomial
//
// {{cite IUCN}} templates with |ref= holding any value retain the parameter so that {{sfn}} or {{harv}} links
// aren't broken.  Any replacement citation that does not use |ref= may have a different author list from the
// 'original' so, when the underlying {{cite journal}} creates a CITEREF id for the new name list, the {{sfn}}
// or {{harv}} links will be broken ...
//
// does not update references in the taxobox (|status_ref= handled above); example: [[Picea abies]]
//

	ArticleText = hide (ArticleText, IS_CITE_IUCN);									// hide all templates except cite IUCN templates

	if (Regex.Match (ArticleText, iucn_template_pattern).Success)
		ArticleText = Regex.Replace (ArticleText, iucn_template_pattern,
			delegate(Match match)
				{
				string	template = match.Groups[0].Value;							// this will be returned if no changes
				string	ref_param = null;

				iucn_template_count++;												// bump total number of cite IUCN templates tally

				string id = taxon_id_from_old_form_url_get (template);

				if (null == id)														// not an old-form-url template so ignore it
					return template;

				if (Regex.Match (template, @"__P1P3__\s*(?:errata|amends)\s*=\s*\d{4}").Success)
					{
					error_log_add ("[cite IUCN update]: template has |errata= or |amends= parameter (id: " + id + ")");
					return template;
					}

				string name = null;
				if (Regex.Match (template, iucn_title).Success)
					{
					name = Regex.Match (template, iucn_title).Groups[1].Value.Trim();
					name = species_name_cleanup (name);								// remove markup, extinction markers, disambiguation, etc
					}

				string api_url_id = api_id_url + id + api_token;					// build the url from its various parts
				string api_url_name = api_name_url + name + api_token;				// build the url from its various parts

				string cite_iucn = cite_iucn_get (api_url_id, api_url_name, ArticleTitle, id, name);
				if (null == cite_iucn)
					return template;

				template = Regex.Replace (template, ref_param_empty, "$1");			// remove empty |ref= parameters from template

				if (Regex.Match (template, ref_param_not_empty).Success)			// if this template has |ref=<something>
					ref_param = Regex.Match (template, ref_param_not_empty).Groups[1].Value.Trim();	// get its assigned value

				if (null != ref_param)
					cite_iucn = Regex.Replace (cite_iucn, @"(\}\})", " |ref=" + ref_param + "$1");	// add the preexisting |ref= param

				template_modified_count++;
				return cite_iucn;
				});

	ArticleText = unhide (ArticleText);												// unhide all that is hidden


//---------------------------< C I T E   J O U R N A L / W E B   U P D A T E S >------------------------------
//
// this segment updates {{cite journal}} and {{cite web}} templates that have iucn urls, pages, or dois.  This
// section fetches the current citation from the IUCN API using the taxon id (preferred) or using the 'name'
// in |title=.  The 'name' in |title= is presumed to be an italicized binomial
//
// {{cite journal}} and {{cite web}} templates with |ref= holding any value retain the parameter so that {{sfn}}
// or {{harv}} links aren't broken.  Any replacement {{cite IUCN}} that does not use |ref= may have a different
// author list from the 'original' so, when the underlying {{cite journal}} creates a CITEREF id for the new name
// list, the {{sfn}} or {{harv}} links will be broken ...
//
// does not update references in the taxobox (|status_ref= handled above)
//

	ArticleText = hide (ArticleText, IS_CITE_OTHER);							// hide all templates except cite journal and cite web templates

	if (Regex.Match (ArticleText, other_template_pattern).Success)
		ArticleText = Regex.Replace (ArticleText, other_template_pattern,
			delegate(Match match)
				{
				string	template = match.Groups[0].Value;							// this will be returned if no changes
				string	ref_param = null;

				other_template_count++;												// bump total number of cite journal/web templates tally

				string	id = plain_text_taxon_id_get (template);					// attempt to get taxon id from page -> doi -> url

				if (null == id)														// not an 'iucn' template so ignore it
					return template;

//	cite journal and cite web don't support |errata= or |amends=
//				if (Regex.Match (template, @"__P1P3__\s*(?:errata|amends)\s*=\s*\d{4}").Success)
//					{
//					error_log_add ("[cite IUCN update]: template has |errata= or |amends= parameter (id: " + id + ")");
//					return template;
//					}

				string name = null;
				if (Regex.Match (template, iucn_title).Success)						// get value assigned to |title=
					{
					name = Regex.Match (template, iucn_title).Groups[1].Value.Trim();
					name = species_name_cleanup (name);								// remove markup, extinction markers, disambiguation, etc
					}

				string api_url_id = api_id_url + id + api_token;					// build the api url from its various parts
				string api_url_name = api_name_url + name + api_token;				// build the api url from its various parts

				string cite_iucn = cite_iucn_get (api_url_id, api_url_name, ArticleTitle, id, name);
				if (null == cite_iucn)
					return template;

				template = Regex.Replace (template, ref_param_empty, "$1");			// remove empty |ref= parameters from template

				if (Regex.Match (template, ref_param_not_empty).Success)			// if this template has |ref=<something>
					ref_param = Regex.Match (template, ref_param_not_empty).Groups[1].Value.Trim();	// get its assigned value

				if (null != ref_param)
					cite_iucn = Regex.Replace (cite_iucn, @"(\}\})", " |ref=" + ref_param + "$1");	// add the preexisting |ref= param

				other_template_modified_count++;
				return cite_iucn;
				});

	ArticleText = unhide (ArticleText);												// unhide all that is hidden


//---------------------------< P L A I N _ T E X T _ R E F _ U P D A T E >------------------------------------
//
// update plain-text references first in ArticleText and then in the taxobox
//

	ArticleText = plain_text_ref_update (ArticleText, ArticleTitle);
																					// all of these create or rely on <ref iucn status <'date'>>{{cite IUCN}}
	if ((status_added || (0 != iucn_status_confirmed_count) || (0 != iucn_status_updated_count)) && (status_ref_added || status_ref_updated || status_ref_current))
		taxobox = plain_text_ref_update (taxobox, ArticleTitle);					// otherwise do not update plain-text references in taxobox because |status_ref= might be plain text


//---------------------------< I U C N   P L A I N - T E X T   B I B L I O G R A P H Y   U P D A T E >--------
//
// this is the plain-text form API id only.  Plain-text references in bibliographies must be in unordered list
// markup \n*...\n
//
// known issues:
//		because this attempts to locate 'correct' plain-text citations and because any non-template and non-
//		wikilink text is plain text, plain text that is part of the unordered list item that is not part of the
//		actual IUCN citation will be treated as part of the citation and will be replaced with the {{cite IUCN}}
//		template if the API returns a citation for the taxon id.
//

	if (Regex.Match (ArticleText, plain_text_bib_pattern).Success)					// must have the form \n*plain text\n must be constrained because article is plain text
		ArticleText = Regex.Replace (ArticleText, plain_text_bib_pattern,
			delegate(Match match)
				{
				string	plain_text = match.Groups[0].Value;							// this will be returned if no changes

				string	taxon_id = plain_text_taxon_id_get (plain_text);			// attempt to get taxon id
				if (null == taxon_id)
					return plain_text;												// no taxon id so abandon

				if (is_plain_text_rejected (plain_text))							// returns true when plain_text is rejected
					return plain_text;

				string	ref_open = match.Groups[1].Value;							// the opening \n*
				string	ref_close = match.Groups[3].Value;							// the closing \n tag

				plain_text_count++;													// bump total number of plain-text references found

				string api_url = api_id_url + taxon_id + api_token;					// build the url from its various parts
				string cite_iucn = cite_iucn_get (api_url, null, ArticleTitle, taxon_id, null);	// go build a {{cite IUCN}} template from the api

				if (null == cite_iucn)
					return plain_text;												// template build failed

				plain_text_modified_count++;
				return ref_open + cite_iucn + ref_close;
				});


//---------------------------< I U C N   S T A T U S   T E M P L A T E >--------------------------------------
//
// Update status in {{IUCN status|<status>|<taxon id>|<options>}}
//

	if (Regex.Match (ArticleText, iucn_status_template_pattern).Success)
		ArticleText = Regex.Replace (ArticleText, iucn_status_template_pattern,
			delegate(Match match)
				{
				string template = match.Groups[0].Value;							// if no change, return this
				string status = null;
				string id = null;

				if (Regex.Match (template, iucn_status_status).Success)
					status = Regex.Match (template, iucn_status_status).Groups[2].Value;
				else
					return template;

				if (Regex.Match (template, iucn_status_id).Success)
					id = Regex.Match (template, iucn_status_id).Groups[2].Value;
				else
					return template;

				string species_from_api;											// species data from the API will go here
				string api_url = api_species_id_url + id + api_token;				// build the url from its various parts

				species_from_api = api_fetch (api_url, ArticleTitle);				// fetch species data from the IUCN API

				if (null == species_from_api)										// if api_fetch() failed
					return template;

				string	status_from_api = null;
				if (Regex.Match (species_from_api, status_from_api_pattern).Success)
					status_from_api = Regex.Match (species_from_api, status_from_api_pattern).Groups[1].Value;
				else
					{
					error_log_add ("[iucn status template]: API did not return species data: " + code_nowiki (species_from_api));
					api_no_species_return_id_count++;
					return template;
					}

				if (status == status_from_api)										// if status same as api status
					iucn_status_confirmed_count++;									// bump the confirmed count and done
				else
					{
					template = Regex.Replace (template, iucn_status_lead + status, "$1" + status_from_api);	// update
					iucn_status_updated_count++;									// bump the updated count
					}

				return template;
				});


//---------------------------< R E M O V E   D U P L I C A T E   S T A T U S   R E F >------------------------
//
// convert the |status_ref= {{cite IUCN}} template into a regex to find duplicates of itself in ArticleText and
// then replace any duplicates with the |status_ref= self-closed tag
//
// replaces duplicates in taxobox only after hiding the |status_ref= definition so that we don't lose the definition
//
// problem: if the duplicate is named and is the definition for other self-closed ref tags, all of those tags
// need to be renamed ... argh example: [[Bellamya trochlearis]], [[Catarina pupfish]]
//

	if ((null != taxobox_status_ref) && (null != taxobox_status_ref_sc_tag))
		{
		string taxobox_status_ref_pattern = taxobox_status_ref;

		foreach (string symbol in symbols)
			taxobox_status_ref_pattern = Regex.Replace (taxobox_status_ref_pattern, symbol, symbol);			// convert taxobox_status_ref to a regex pattern

																												// references in unordered lists always ok to replace
		ArticleText = counted_replace (ArticleText, bib_open_ul + taxobox_status_ref_pattern + bib_close_ul, "$1", ref duplicates_removed_count);

																												// references with unnamed <ref> tags always ok to replace
		ArticleText = counted_replace (ArticleText, ref_open_tag_unnamed + @"\s*" + taxobox_status_ref_pattern + @"\s*" + ref_close_tag, taxobox_status_ref_sc_tag, ref duplicates_removed_count);
		taxobox = counted_replace (taxobox, ref_open_tag_unnamed + @"\s*" + taxobox_status_ref_pattern + @"\s*" + ref_close_tag, taxobox_status_ref_sc_tag, ref duplicates_removed_count);

		taxobox = hide_taxobox_status_ref (taxobox, taxobox_status_ref_open_tag, taxobox_status_ref_pattern);	// hide |status_ref= {{cite IUCN}} template so we don't replace it with sc tag

		named_status_ref_dup_remove (ref ArticleText, ref taxobox, taxobox_status_ref_pattern, taxobox_status_ref_sc_tag);	// remove duplicates

																										// remove sequential instances of taxobox_status_ref_open_tag_sc TODO: this could be improved
		string taxobox_status_ref_open_tag_sc = Regex.Replace (taxobox_status_ref_open_tag, @"([^\>]+)\>", "$1 />");

		taxobox = Regex.Replace (taxobox, taxobox_status_ref_open_tag_sc + @"\s*" + taxobox_status_ref_open_tag_sc, taxobox_status_ref_sc_tag);
		ArticleText = Regex.Replace (ArticleText, taxobox_status_ref_open_tag_sc + @"\s*" + taxobox_status_ref_open_tag_sc, taxobox_status_ref_sc_tag);
		}


//---------------------------< C L E A N U P >----------------------------------------------------------------

	if (null != taxobox)
		taxobox = unhide (taxobox);

	ArticleText = hide (ArticleText, "[Rr]eflist");

	while (Regex.Match (ArticleText, reflist_cleanup).Success)							// remove self-closed ref tags from {{reflist}} (European fire-bellied toad)
		{
		ArticleText = Regex.Replace (ArticleText, reflist_cleanup, "$1");
		ArticleText = Regex.Replace (ArticleText, @"(\{\{)\s*([Rr]eflist[^\|]*)\s*\|\s*refs\s*=\s*(\}\})", "$1$2$3");
		}

	ArticleText = unhide (ArticleText);

	if (null != taxobox)
		ArticleText = Regex.Replace (ArticleText, taxobox_blank_pattern, taxobox);

	ArticleText = Regex.Replace (ArticleText, angle_open, "<");
	ArticleText = Regex.Replace (ArticleText, angle_close, ">");


//---------------------------< F I N I S H >------------------------------------------------------------------

	if (status_added)															// build our edit summary
		Summary = summary_concat (Summary, " IUCN status added;");
	if (0 != iucn_status_confirmed_count)										// build our edit summary
		Summary = summary_concat (Summary, " IUCN status confirmed" + ((1 < iucn_status_confirmed_count) ? " (" + iucn_status_confirmed_count + "×);" : ";"));
	if (0 != iucn_status_updated_count)
		Summary = summary_concat (Summary, " IUCN status updated" + ((1 < iucn_status_updated_count) ? " (" + iucn_status_updated_count + "×);" : ";"));

	if ((0 != iucn_status_confirmed_count) || (0 != iucn_status_updated_count) || status_added)
		{
		if (0 != iucn_status_system_updated_count)
			Summary = summary_concat (Summary, " IUCN status system updated;");
		else if (status_system_added)
			Summary = summary_concat (Summary, " IUCN status system added;");
		}

	string dup_text = "";
	switch (duplicates_removed_count)
		{
		case 0:
			dup_text = ";";
			break;
		case 1:
			dup_text = " [duplicate removed];";
			break;
		default:
			dup_text = " [duplicates removed (" + duplicates_removed_count + "×)];";
			break;
		}

	if (status_ref_added)
		Summary = summary_concat (Summary, " IUCN status ref added" + dup_text);

	if (status_ref_updated)
		Summary = summary_concat (Summary, " IUCN status ref updated" + dup_text);

	if (status_ref_current)
		Summary = summary_concat (Summary, " IUCN status ref current;");


	if (0 != plain_text_count)													// build our edit summary
		{
		Summary = summary_concat (Summary, " evaluated " + plain_text_count + " reference" + (1 == plain_text_count ? ";" : "s;"));

		if (0 != plain_text_modified_count)
			Summary = summary_concat (Summary, " " + plain_text_modified_count + " reference" + (1 == plain_text_modified_count ? " " : "s ") + "modified;");
		}

	if (0 != iucn_template_count)
		{
		Summary = summary_concat (Summary, " evaluated " + iucn_template_count + " {{cite IUCN}}" + (1 == iucn_template_count ? ";" : "s;"));

		if (0 != template_modified_count)
			Summary = summary_concat (Summary, " " + template_modified_count + " template" + (1 == template_modified_count ? " " : "s ") + "modified;");
		}

	if ((0 != other_template_count) && (0 != other_template_modified_count))	// only report 'other templates' when we modify
		{
		Summary = summary_concat (Summary, " evaluated " + other_template_count + " other template" + (1 == other_template_count ? ";" : "s;"));

		if (0 != other_template_modified_count)
			Summary = summary_concat (Summary, " " + other_template_modified_count + " template" + (1 == other_template_modified_count ? " " : "s ") + "modified;");
		}

	if (0 != page_doi_skip_count)
		Summary = summary_concat (Summary, " skipped doi/page mismatch (" + page_doi_skip_count + "×);");

	if (0 != api_no_cite_return_count)
		Summary = summary_concat (Summary, " API cite nil return (" + api_no_cite_return_count + "×);");

	if (0 != api_no_species_return_id_count)									// for {{IUCN status}}
		Summary = summary_concat (Summary, " API species nil return (id) (" + api_no_species_return_id_count + "×);");

	if (0 != api_no_species_return_name_count)
		Summary = summary_concat (Summary, " API species nil return (name) (" + api_no_species_return_name_count + "×);");

	if (null != unrecognized_species_name)
		Summary = summary_concat (Summary, " unrecognized binomial: " + unrecognized_species_name + ";");

	stopwatch.Stop();															// stop the stopwatch
	TimeSpan ts = stopwatch.Elapsed;											// get the elapsed time and tack it onto the edit summary

	Summary = Summary + " (" + api_call_count + "/" + String.Format("{0:00}:{1:00}.{2:00}", ts.Minutes, ts.Seconds, ts.Milliseconds / 10) + ");";

	if (!status_ref_added && !status_ref_updated && (0 == iucn_status_updated_count))	// iucn_status_updated_count for {{IUCN status}} updates (List of reptiles of North America)
		{
		if (0 == iucn_template_count)
			{
			if ((0 != plain_text_count) && (plain_text_count == page_doi_skip_count))
				{
				error_log_add ("auto-skipped: doi/page mismatch");
				Skip = true;
				}

			if ((0 != plain_text_count) && (plain_text_count == api_no_cite_return_count))
				{
				error_log_add ("auto-skipped: number of plain-text citations is same as number of API citation nil returns");
				Skip = true;
				}
			}
		if (0 == plain_text_count)
			{
			if ((0 != iucn_template_count) && (iucn_template_count == page_doi_skip_count))
				{
				error_log_add ("auto-skipped: doi/page mismatch");
				Skip = true;
				}

			if ((0 != iucn_template_count) && (iucn_template_count == api_no_cite_return_count))
				{
				error_log_add ("auto-skipped: number of cite IUCN templates is same as number of API citation nil returns");
				Skip = true;
				}
			}
		}

	if ("" == ArticleText)														// trap to see if the 'blanked' pages that sometimes occur are the fault of this script
		{
		error_log_add ("auto-skipped: ArticleText is empty string");			// error message
		Skip = true;															// force a skip
		}

	if (0 != error_log_list.Count)
		log_errors (ArticleTitle, error_log_list);
	return ArticleText;
	}


//===========================<< S U P P O R T >>==============================================================

//---------------------------< N A M E D _ S T A T U S _ R E F _ D U P _ R E M O V E >------------------------
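//
// finds named <ref name="..."> definitions that duplicate the |status_ref= {{cite IUCN}} template, first in
// <taxobox> and then in <article_text>.  For each duplicate found: self-closed tags that use the duplicate's
// name are renamed to <taxobox_status_ref_sc_tag> in both <taxobox> and <article_text>, and the duplicate
// definition is replaced with <taxobox_status_ref_sc_tag>
//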

//private string named_status_ref_dup_remove (ref string text, string taxobox_status_ref_pattern, string taxobox_status_ref_sc_tag)
//	{
//	Match dup_match = Regex.Match (text, @"\<[Rr][Ee][Ff]\s*name\s*=\s*""?([^""\>]+)""?\>\s*" + taxobox_status_ref_pattern + @"\s*\</[Rr][Ee][Ff]\>");
//	if (dup_match.Success)
//		{
//		string name = dup_match.Groups[1].Value;													// get the reference's name from <ref name=...> tag
//		string ref_tag_replace_pattern = @"\<[Rr][Ee][Ff]\s*name\s*=\s*""""?" + name + @"""""?\s*\>";	// make a <ref name=... > pattern from name
//		string sc_replace_pattern = @"\<[Rr][Ee][Ff]\s*name\s*=\s*""""?" + name + @"""""?\s*/\>";	// make a self-closed <ref name=... /> pattern from name

//		text = Regex.Replace (text, sc_replace_pattern, taxobox_status_ref_sc_tag);					// replace any <ref name=... /> with <ref name="iucn status <date> /> sc tag
//		text = counted_replace (text, ref_open_tag_named + @"\s*" + taxobox_status_ref_pattern + @"\s*" + ref_close_tag, taxobox_status_ref_sc_tag, ref duplicates_removed_count);	// now remove any duplicates

//		return sc_replace_pattern;
//		}
//	return null;
//	}

private void named_status_ref_dup_remove (ref string article_text, ref string taxobox, string taxobox_status_ref_pattern, string taxobox_status_ref_sc_tag)
	{
	Match dup_match;
	string name = null;
	string ref_tag_replace_pattern = null;
	string sc_replace_pattern = null;

	dup_match = Regex.Match (taxobox, @"\<[Rr][Ee][Ff]\s*name\s*=\s*""?([^""\>]+)""?\>\s*" + taxobox_status_ref_pattern + @"\s*\</[Rr][Ee][Ff]\>");
	while (dup_match.Success)
		{
		name = dup_match.Groups[1].Value;																		// get the reference's name from <ref name=...> tag

		ref_tag_replace_pattern = @"\<[Rr][Ee][Ff]\s*name\s*=\s*""?" + Regex.Escape (name) + @"""?\s*\>";		// make a <ref name=... > pattern from name; quotes are optional to agree with the dup_match pattern
		sc_replace_pattern = @"\<[Rr][Ee][Ff]\s*name\s*=\s*""?" + Regex.Escape (name) + @"""?\s*/\>";			// make a self-closed <ref name=... /> pattern from name

		taxobox = Regex.Replace (taxobox, sc_replace_pattern, taxobox_status_ref_sc_tag);						// replace any <ref name=... /> with <ref name="iucn status <date>" /> sc tag
		article_text = Regex.Replace (article_text, sc_replace_pattern, taxobox_status_ref_sc_tag);				// replace any <ref name=... /> with <ref name="iucn status <date>" /> sc tag

		taxobox = counted_replace (taxobox, ref_tag_replace_pattern + @"\s*" + taxobox_status_ref_pattern + @"\s*" + ref_close_tag, taxobox_status_ref_sc_tag, ref duplicates_removed_count);	// now remove any duplicates

		dup_match = Regex.Match (taxobox, @"\<[Rr][Ee][Ff]\s*name\s*=\s*""?([^""\>]+)""?\>\s*" + taxobox_status_ref_pattern + @"\s*\</[Rr][Ee][Ff]\>");
		}

	dup_match = Regex.Match (article_text, @"\<[Rr][Ee][Ff]\s*name\s*=\s*""?([^""\>]+)""?\>\s*" + taxobox_status_ref_pattern + @"\s*\</[Rr][Ee][Ff]\>");
	while (dup_match.Success)
		{
		name = dup_match.Groups[1].Value;																		// get the reference's name from <ref name=...> tag

		ref_tag_replace_pattern = @"\<[Rr][Ee][Ff]\s*name\s*=\s*""?" + Regex.Escape (name) + @"""?\s*\>";		// make a <ref name=... > pattern from name
		sc_replace_pattern = @"\<[Rr][Ee][Ff]\s*name\s*=\s*""?" + Regex.Escape (name) + @"""?\s*/\>";			// make a self-closed <ref name=... /> pattern from name; quotes are optional to agree with the dup_match pattern

		article_text = Regex.Replace (article_text, sc_replace_pattern, taxobox_status_ref_sc_tag);				// replace any <ref name=... /> with <ref name="iucn status <date>" /> sc tag
		taxobox = Regex.Replace (taxobox, sc_replace_pattern, taxobox_status_ref_sc_tag);						// replace any <ref name=... /> with <ref name="iucn status <date>" /> sc tag

		article_text = counted_replace (article_text, ref_tag_replace_pattern + @"\s*" + taxobox_status_ref_pattern + @"\s*" + ref_close_tag, taxobox_status_ref_sc_tag, ref duplicates_removed_count);	// now remove any duplicates

		dup_match = Regex.Match (article_text, @"\<[Rr][Ee][Ff]\s*name\s*=\s*""?([^""\>]+)""?\>\s*" + taxobox_status_ref_pattern + @"\s*\</[Rr][Ee][Ff]\>");
		}
	}


//---------------------------< H I D E _ T A X O B O X _ S T A T U S _ R E F >--------------------------------
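//
// hides the {{cite IUCN}} template that is the |status_ref= definition so that duplicate removal does not
// replace the definition with a self-closed tag; returns <taxobox> unchanged when the definition is not found
//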

private string hide_taxobox_status_ref (string taxobox, string taxobox_status_ref_open_tag, string taxobox_status_ref_pattern)
	{
	Match dup_match = Regex.Match (taxobox, "(" + taxobox_status_ref_open_tag +")(" + taxobox_status_ref_pattern + ")");	// look for and capture |status_ref= definition
	if (dup_match.Success)
		{
		string hidden_status_ref = hide (dup_match.Groups[2].Value, IS_TAXOBOX);							// spoof to hide {{cite IUCN}} in |status_ref=
		return Regex.Replace (taxobox, "(" + taxobox_status_ref_open_tag +")(" + taxobox_status_ref_pattern + ")", "$1" + hidden_status_ref);	// replace with the hidden definition
		}

	return taxobox;
	}




//---------------------------< I U C N   P L A I N - T E X T   R E F E R E N C E   U P D A T E >--------------
//
// this is the plain-text form API id only.  Plain-text citations must be wrapped with <ref ...>...</ref> tags
//
// known issues:
//		because this attempts to locate 'correct' plain-text citations and because any non-template and non-
//		wikilink text is plain text, plain text inside <ref ...>...</ref> that is not part of the actual IUCN
//		citation will be treated as part of the citation and will be replaced with the {{cite IUCN}} template
//		if the API returns a citation for the taxon id.
//
//		does not update plain-text references in the taxobox (|status_ref= handled above); example: [[Picea abies]]
//

private string plain_text_ref_update (string text, string article_title)
	{
	if (Regex.Match (text, plain_text_ref_pattern).Success)							// must have the form <ref ...>plain text</ref>; must be constrained because article is plain text
		text = Regex.Replace (text, plain_text_ref_pattern,
			delegate(Match match)
				{
				string	plain_text = match.Groups[0].Value;							// this will be returned if no changes

				string	taxon_id = plain_text_taxon_id_get (plain_text);			// attempt to get taxon id
				if (null == taxon_id)
					return plain_text;												// no taxon id so abandon

				if (is_plain_text_rejected (plain_text))							// returns true when plain_text is rejected
					return plain_text;

				string	ref_open = match.Groups[1].Value.Trim();					// the opening <ref> tag
				string	ref_close = match.Groups[3].Value.Trim();					// the closing </ref> tag

				plain_text_count++;													// bump total number of plain-text references found

				string api_url = api_id_url + taxon_id + api_token;					// build the url from its various parts
				string cite_iucn = cite_iucn_get (api_url, null, article_title, taxon_id, null);	// go build a {{cite IUCN}} template from the api

				if (null == cite_iucn)
					return plain_text;												// template build failed

				plain_text_modified_count++;
				return ref_open + cite_iucn + ref_close;
				});

	return text;
	}


//---------------------------< T A X O B O X _ G E T >--------------------------------------------------------
//
// gets the {{taxobox}} or {{speciesbox}} template from <article_text>
//

private string taxobox_get (string article_text)
	{
	if (Regex.Match (article_text, taxobox_template_pattern).Success)
		return Regex.Match (article_text, taxobox_template_pattern).Groups[0].Value;

	return null;
	}


//---------------------------< T A X O B O X _ U P D A T E >--------------------------------------------------
//
// updates |status=, |status_system=, and |status_ref= parameters; returns true when updated; false else
//

private bool taxobox_update (ref string taxobox, ref string article_text, string article_title)
	{
	if (null == taxobox)																// if no taxobox
		return false;

	taxobox_blank = Regex.Replace (taxobox, taxobox_template_pattern, "$1$3");

	taxobox = Regex.Replace (taxobox, stray_dot, "$1");									// delete stray . because I found one such
	taxobox = Regex.Replace (taxobox, stray_splat, "$1");								// delete stray * because I found one such
	taxobox = Regex.Replace (taxobox, stray_equal, "$1");								// delete stray = because I found one such
	taxobox = Regex.Replace (taxobox, stray_nbsp, "$1");								// delete stray &nbsp; because I found one such
	taxobox = Regex.Replace (taxobox, html_comment, "$1");								// and html comments (Euconocephalus remotus)

	string	taxobox_status_val = null;
	string	taxobox_status_system_val = null;
	string	taxobox_status_ref_val = null;
	string	taxobox_status_ref_type = null;
	string	taxobox_status_ref_name = null;												// original name from <ref name="original name"> or <ref name="original name" />
	bool	taxobox_status_ref_is_empty = false;
	string	taxobox_status_date = null;
	int		taxobox_status_date_diff = 100;
	string	taxobox_species_name_val = null;

	string	api_status_val = null;
	string	api_status_system_val = null;

	taxobox_species_name_val = taxobox_species_name_get (taxobox, article_title);		// get species name from taxobox or article title
	if (api_species_data_get (taxobox_species_name_val, ref api_status_val, ref api_status_system_val, article_title))
		{																								// when here presume that we can also get citation data from api
		taxobox_status_val = taxobox_status_get (taxobox);
		taxobox_status_system_val = taxobox_system_get (taxobox);

		if ((((null != taxobox_status_val) && is_iucn_status (taxobox_status_val)) ||					// has a value that is an IUCN status or
			((null != taxobox_status_system_val) && is_iucn_system (taxobox_status_system_val))) ||		// has a value that is an IUCN system or
			((null == taxobox_status_val) && (null == taxobox_status_system_val)))						// both are missing or empty
				{
				taxobox_status_update (ref taxobox, api_status_val, taxobox_status_val);
				taxobox_system_update (ref taxobox, api_status_system_val, taxobox_status_system_val);
				}
		else
			return false;

		taxobox_status_ref_val = taxobox_status_ref_get (taxobox, ref taxobox_status_ref_is_empty);

		if (null != taxobox_status_ref_val)
			{
			if (Regex.Match (taxobox_status_ref_val, amended_text).Success)
				{
				error_log_add ("taxobox_update(): plain-text |status_ref= has amended text");
				return false;
				}

			if (Regex.Match (taxobox_status_ref_val, errata_text).Success)
				{
				error_log_add ("taxobox_update(): plain-text |status_ref= has errata text");
				return false;
				}

			if (Regex.Match (taxobox_status_ref_val, @"__P1P3__\s*(?:errata|amends)\s*=\s*\d{4}").Success)
				{
				error_log_add ("taxobox_update(): |status_ref= citation has |errata= or |amends= parameter");
				return false;
				}
			}

		taxobox_status_ref_type = taxobox_status_ref_type_get (taxobox_status_ref_val, ref taxobox_status_ref_name);

		string api_url = null;

		if (("named" == taxobox_status_ref_type) || ("unnamed" == taxobox_status_ref_type) || (null == taxobox_status_ref_type))
			{
			if (null != taxobox_status_ref_val)
				{
				taxobox_status_date = taxobox_status_date_get (taxobox_status_ref_val, taxobox_status_ref_name);
				taxobox_status_date_diff = taxobox_status_date_diff_get (taxobox_status_date);
				}

			if (6 < taxobox_status_date_diff)
				{
				api_url = api_name_url + taxobox_species_name_val + api_token;			// build citation url from its various parts
				taxobox_status_ref = cite_iucn_get (api_url, null, article_title, null, taxobox_species_name_val);	// go build a {{cite IUCN}} template from the api

				if (null == taxobox_status_ref)
					return false;														// template build failed

				new_ref_tags_make (taxobox_status_ref, ref taxobox_status_ref_sc_tag, ref taxobox_status_ref_open_tag);

				if (null == taxobox_status_ref_val)										// if empty or missing
					{
					if (taxobox_status_ref_is_empty)
						{
						taxobox = Regex.Replace (taxobox, taxobox_status_ref_empty_pattern, "$1" + taxobox_status_ref_open_tag + taxobox_status_ref + "</ref>$2");
						status_ref_added = true;
						}
					else																// here when |status_ref= is missing
						{
						taxobox = Regex.Replace (taxobox, taxobox_new_stat_sys_ref_pattern, "$1$2|status_ref=" + taxobox_status_ref_open_tag + taxobox_status_ref + "</ref>$2$3");
						status_ref_added = true;
						}
					}
				else
					{
					taxobox = Regex.Replace (taxobox, taxobox_status_ref_pattern, "$1" + taxobox_status_ref_open_tag + taxobox_status_ref + "</ref>");
					if ("named" == taxobox_status_ref_type)								// go rename all of the self-closed ref tags in article text and in the taxobox
						{
						article_text = Regex.Replace (article_text, sc_ref_tag_begin + taxobox_status_ref_name + sc_ref_tag_end, taxobox_status_ref_sc_tag);
						taxobox = Regex.Replace (taxobox, sc_ref_tag_begin + taxobox_status_ref_name + sc_ref_tag_end, taxobox_status_ref_sc_tag);
						}

					status_ref_updated = true;
					}
				}
			else
				status_ref_current = true;
			}
		else if ("named_sc" == taxobox_status_ref_type)
			{
			if (Regex.Match (article_text, ref_def_begin + taxobox_status_ref_name + ref_def_end).Success)
				{
				taxobox_status_ref_val = Regex.Match (article_text, ref_def_begin + taxobox_status_ref_name + ref_def_end).Groups[0].Value;
				taxobox_status_ref_val = unhide (taxobox_status_ref_val);
				taxobox_status_date = taxobox_status_date_get (taxobox_status_ref_val, taxobox_status_ref_name);
				taxobox_status_date_diff = taxobox_status_date_diff_get (taxobox_status_date);

				if (6 < taxobox_status_date_diff)
					{
					api_url = api_name_url + taxobox_species_name_val + api_token;			// build citation url from its various parts
					taxobox_status_ref = cite_iucn_get (api_url, null, article_title, null, taxobox_species_name_val);	// go build a {{cite IUCN}} template from the api

					if (null == taxobox_status_ref)
						return false;														// template build failed

					new_ref_tags_make (taxobox_status_ref, ref taxobox_status_ref_sc_tag, ref taxobox_status_ref_open_tag);

																							// replace original definition with new sc ref tag
					article_text = Regex.Replace (article_text, ref_def_begin + taxobox_status_ref_name + ref_def_end, taxobox_status_ref_sc_tag);

																							// replace original |status_ref= sc ref tag with new definition
					taxobox = Regex.Replace (taxobox, taxobox_status_sc_ref_pattern, "$1" + taxobox_status_ref_open_tag + taxobox_status_ref + "</ref>");

																							// rename original sc ref tags
					article_text = Regex.Replace (article_text, sc_ref_tag_begin + taxobox_status_ref_name + sc_ref_tag_end, taxobox_status_ref_sc_tag);
					taxobox = Regex.Replace (taxobox, sc_ref_tag_begin + taxobox_status_ref_name + sc_ref_tag_end, taxobox_status_ref_sc_tag);

					status_ref_updated = true;
					}
				}

			else
				error_log_add ("taxobox_update(): no definition for: " + code_nowiki (taxobox_status_ref_val));
			}
		else
			{
			error_log_add ("taxobox_update(): no " + code_nowiki ("|status_ref="));
			}
		}

	else																										// here when binomial is not recognized by iucn
		{
		if (null != taxobox_species_name_val)
			{
			taxobox_status_val = taxobox_status_get (taxobox);													// if either of these then add a maintenance category and ...
			taxobox_status_system_val = taxobox_system_get (taxobox);											// ... save unrecognized binomial for edit summary only when ...

			if ((((null != taxobox_status_val) && is_iucn_status (taxobox_status_val)) ||						// ... |status= has a value that is an IUCN status or
				((null != taxobox_status_system_val) && is_iucn_system (taxobox_status_system_val))) ||			// |status_system= has a value that is an IUCN system or
				((null == taxobox_status_val) && (null == taxobox_status_system_val)))							// both are missing or empty (example: Barlow's lark)
					{
					unrecognized_species_name = Uri.UnescapeDataString (taxobox_species_name_val);				// remove percent encoding
					string cat_plus_name = "[[Category:Taxobox binomials not recognized by IUCN]]" + " <!-- " + unrecognized_species_name + " -->";

					MatchCollection matches = Regex.Matches (article_text, @"__WL1NK_O__[Cc]ategory:.+__WL1NK_C__");	// find all of the categories

					if (0 != matches.Count)									// non-zero when categories found
						{
						int index = matches.Count - 1;						// make an indexer from Count and then replace last one with itself + our category
						article_text = Regex.Replace (article_text, Regex.Escape (matches[index].Value), matches[index].Value + '\n' + cat_plus_name);	// escape because the matched text is used as a pattern
						}
					else													// here when no categories; look for stub templates
						{
						matches = Regex.Matches (article_text, @"__0P3N__.+\-stub__CL0S3__");	// find all of the stub templates
						if (0 != matches.Count)								// non-zero when stub templates found
							article_text = Regex.Replace (article_text, matches[0].Value, cat_plus_name + '\x0A' + '\x0A' + matches[0].Value);
						else												// here when no categories and no stub templates
							article_text = article_text + '\x0A' + cat_plus_name;	// no cats and no stub templates, add to the end
						}

					// binomial may not be recognized for a global assessment but is recognized for a regional assessment;
					// this script cannot know which region so cannot use the regional form of the citation API call:
					//		/api/v3/species/citation/:name/region/:region_identifier?token='YOUR TOKEN'
					// binomial may be recognized in iucn search box (as a redirect-like name) but that is not available
					// to the API (and if it were probably shouldn't be used)
					}
			}
		}
	taxobox = unhide (taxobox);
	article_text = Regex.Replace (article_text, taxobox_template_pattern, taxobox_blank);	// install a blank so that we don't spend time evaluating the citation in |status_ref=
	return true;
	}


//---------------------------< N E W _ R E F _ T A G S _ M A K E >--------------------------------------------
//
// makes self-closed and normal <ref> tags for new |status_ref= {{cite IUCN}} reference using |access-date= from
// the {{cite IUCN}} template
//

private void new_ref_tags_make (string cite_iucn, ref string new_self_closed_tag, ref string taxobox_status_ref_open_tag)
	{
	string date = Regex.Match (cite_iucn, access_date).Groups[1].Value.Trim();		// date from new {{cite IUCN}} |access-date=
	new_self_closed_tag = @"<ref name=""iucn status " + date + @""" />";			// make a version to replace short-form ref tags that need to be renamed
	taxobox_status_ref_open_tag = @"<ref name=""iucn status " + date + @""">";		// make a version for |status_ref=
	}


//---------------------------< T A X O B O X _ S T A T U S _ G E T >------------------------------------------
//
// gets value assigned to {{taxobox}} or {{speciesbox}} |status= parameter; returns that value; status validation
// is done by calling function; returns null if |status= is missing or empty.
//

private string taxobox_status_get (string taxobox_template)
	{
	if (!Regex.Match (taxobox_template, taxobox_status_missing).Success || Regex.Match (taxobox_template, taxobox_status_empty).Success)
		return null;															// |status= is missing or empty

	return Regex.Match (taxobox_template, taxobox_status_value).Groups[2].Value.Trim();
	}


//---------------------------< I S _ I U C N _ S T A T U S >--------------------------------------------------
//
// return true if <status> is known IUCN category; false else
//

private bool is_iucn_status (string status)
	{
	if (null == status)
		return false;

	return Regex.Match (status, IS_IUCN_STATUS).Success;
	}


//---------------------------< T A X O B O X _ S T A T U S _ U P D A T E >------------------------------------
//
// updates, adds, or confirms |status= in taxobox using value from iucn API
//

private void taxobox_status_update (ref string taxobox, string api_status_val, string taxobox_status_val)
	{
	if (null == api_status_val)										// did api return species data with IUCN category?
		return;

	if (!Regex.Match (taxobox, taxobox_status_missing).Success)		// if |status= not in taxobox
		{
		taxobox = Regex.Replace (taxobox, taxobox_new_stat_sys_ref_pattern, "$1$2|status=" + api_status_val + "$2$3");
		status_added = true;
		}
	else if (api_status_val != taxobox_status_val)
		{
		taxobox = Regex.Replace (taxobox, taxobox_status_pattern, "$1" + api_status_val + "$2");
		iucn_status_updated_count++;
		}
	else															// here when <api_status_val> == <taxobox_status_val>
		iucn_status_confirmed_count++;								// bump the confirmed count and done
	}


//---------------------------< T A X O B O X _ S Y S T E M _ G E T >------------------------------------------
//
// gets value assigned to {{taxobox}} or {{speciesbox}} |status_system= parameter; returns that value; status_system
// validation is done by calling function; returns null if |status_system= is missing or empty.
//

private string taxobox_system_get (string taxobox_template)
	{
	if (!Regex.Match (taxobox_template, taxobox_system_missing).Success || Regex.Match (taxobox_template, taxobox_system_empty).Success)
		return null;															// |status_system= is missing or empty

	return Regex.Match (taxobox_template, taxobox_system_value).Groups[2].Value.Trim();
	}


//---------------------------< I S _ I U C N _ S Y S T E M >--------------------------------------------------
//
// return true if <system> is known IUCN status system; false else
//

private bool is_iucn_system (string system)
	{
	if (null == system)
		return false;

	return Regex.Match (system, IS_IUCN_SYSTEM).Success;
	}


//---------------------------< T A X O B O X _ S Y S T E M _ U P D A T E >------------------------------------
//
// updates, adds, or confirms |status_system= in taxobox using value from iucn API
//

private void taxobox_system_update (ref string taxobox, string api_status_system_val, string taxobox_status_system_val)
	{
	if (null == api_status_system_val)								// did api return species data from which a status system could be derived?
		return;

	if (!Regex.Match (taxobox, taxobox_system_missing).Success)		// if |status_system= not in taxobox
		{
		taxobox = Regex.Replace (taxobox, taxobox_new_stat_sys_ref_pattern, "$1$2|status_system=" + api_status_system_val + "$2$3");
		status_system_added = true;
		}

	else if (api_status_system_val != taxobox_status_system_val)
		{
		taxobox = Regex.Replace (taxobox, taxobox_system_pattern, "$1" + api_status_system_val + "$2");
		iucn_status_system_updated_count++;
		}
	}


//---------------------------< T A X O B O X _ S T A T U S _ R E F _ G E T >----------------------------------
//
// gets value assigned to {{taxobox}} or {{speciesbox}} |status_ref= parameter; returns that value; ref tags,
// ref name, and reference text extracted by calling function
//

private string taxobox_status_ref_get (string taxobox, ref bool taxobox_status_ref_is_empty)
	{
	if (!Regex.Match (taxobox, taxobox_status_ref_missing).Success)
		return null;															// |status_ref= is missing

	if (Regex.Match (taxobox, taxobox_status_ref_empty).Success)
		{
		taxobox_status_ref_is_empty = true;
		return null;															// |status_ref= is empty
		}

	return Regex.Match (taxobox, taxobox_status_ref_value).Groups[2].Value.Trim();
	}


//---------------------------< T A X O B O X _ S T A T U S _ R E F _ T Y P E _ G E T >------------------------
//
// look at opening <ref> tag and return its type (order of evaluation is important here):
//		<ref> returns 'unnamed'
//		<ref ... name = .../> returns 'named_sc'
//		<ref ... name = ...> returns 'named'
// if none of these, or <taxobox_status_ref_val> is null, returns null
//

private string taxobox_status_ref_type_get (string taxobox_status_ref_val, ref string taxobox_status_ref_name)
	{
	if (null == taxobox_status_ref_val)
		return null;

	if (Regex.Match (taxobox_status_ref_val, ref_tag_unnamed_pattern).Success)
		return "unnamed";

	if (Regex.Match (taxobox_status_ref_val, ref_tag_named_sc_pattern).Success)		// order here important; named_sc test before named test
		{
		taxobox_status_ref_name = Regex.Match (taxobox_status_ref_val, ref_tag_named_sc_pattern).Groups[2].Value.Trim();
		return "named_sc";
		}

	if (Regex.Match (taxobox_status_ref_val, ref_tag_named_pattern).Success)		// order here important; named test after named_sc test
		{
		taxobox_status_ref_name = Regex.Match (taxobox_status_ref_val, ref_tag_named_pattern).Groups[2].Value.Trim();
		return "named";
		}

	return null;											// should never get here
	}


//---------------------------< T A X O B O X _ S T A T U S _ D A T E _ G E T >--------------------------------
//
// attempt to get date of last status update from ref tag (<ref name="iucn status 29 September 2021">) or from
// |access-date= value
//

private string taxobox_status_date_get (string taxobox_status_ref_val, string taxobox_status_ref_name)
	{
	if ((null != taxobox_status_ref_name) && Regex.Match (taxobox_status_ref_name, preferred_status_ref_tag_name).Success)
		return Regex.Match (taxobox_status_ref_name, preferred_status_ref_tag_name).Groups[1].Value.Trim();

	taxobox_status_ref_val = unhide (taxobox_status_ref_val);

	if (Regex.Match (taxobox_status_ref_val, access_date).Success)
		return Regex.Match (taxobox_status_ref_val, access_date).Groups[1].Value.Trim();	// date from |access-date=

	return null;
	}


//---------------------------< T A X O B O X _ S T A T U S _ D A T E _ D I F F _ G E T >----------------------
//
// return the difference in months between today's date and a date from the |status_ref= <ref> tag or from the
// |status_ref= citation's |access-date=
//
// script will not update |status_ref= if date difference is less than 7 months
//

private int taxobox_status_date_diff_get (string date)
	{
	if (null == date)
		{
	//	error_log_add ("taxobox_status_date_diff_get(): nil date value; forcing update");	// not really an error
		return 100;											// any value greater than 6 forces citation update attempt
		}

	int		current_month = DateTimeOffset.Now.Month;
	int		current_year = DateTimeOffset.Now.Year;

	string	month = null;
	string	year = null;

	foreach(KeyValuePair<string, string> date_pattern in date_patterns)
		{
		Match match = Regex.Match (date, date_pattern.Value);
		if (match.Success)
			{
			if ("ymd" == date_pattern.Key)					// because year precedes month, Groups[1] and Groups[2] are ordered differently
				{
				month = match.Groups[2].Value.Trim().ToLower();
				year = match.Groups[1].Value.Trim();
				}
			else											// here when dmy or mdy
				{
				month = match.Groups[1].Value.Trim().ToLower();
				year = match.Groups[2].Value.Trim();
				}
			}
		}

	if ((null == month) || (null == year))
		{
		error_log_add ("taxobox_status_date_diff_get(): month and/or year null; forcing update");
		error_log_add ("year: " + year);
		error_log_add ("month: " + month);
		return 100;											// any value greater than 6 forces citation update attempt
		}

	if (months.ContainsKey (month))
		return ((current_year - Int32.Parse(year)) * 12) + current_month - months[month];
	else
		{
		error_log_add ("taxobox_status_date_diff_get(): month not recognized: " + month + "; forcing update");
		return 100;
		}
	}
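

//---------------------------< M O N T H _ D I F F _ E X A M P L E >------------------------------------------
//
// illustrative sketch only, not called by the task: the month arithmetic used by taxobox_status_date_diff_get()
// with the dictionary lookup removed.  <month_diff_example> and its parameter names are hypothetical; months are
// assumed to be numbered 1 (January) through 12 (December)
//
//		month_diff_example (2021, 10, 2021, 9) returns 1
//		month_diff_example (2022, 1, 2021, 9) returns 4 (12 + 1 - 9)
//

private static int month_diff_example (int current_year, int current_month, int year, int month)
	{
	return ((current_year - year) * 12) + current_month - month;		// whole years as months, then the month offset
	}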


//---------------------------< T A X O B O X _ S P E C I E S _ N A M E _ G E T >------------------------------
//
// attempts to get binomial from various parameters in {{taxobox}} or {{speciesbox}} and failing that the article
// title.
//
// taxobox: |binomial= -> |name= -> article title
// speciesbox: |taxon= -> |genus= + |species= -> |name= -> article title
//
// returns null when <name> is not binomial-like (two words); example [[Africanogyrus]]
//

private string taxobox_species_name_get (string taxobox, string article_title)
	{
	string	template_name = Regex.Match (taxobox, taxobox_template_pattern).Groups[2].Value.ToLower();			// capture is the template name (Taxobox, Speciesbox, etc)

	string	name = null;												// name of this species from various possible parameters in the taxobox template

	if ("taxobox" == template_name)
		{
		if (Regex.Match (taxobox, binomial_pattern).Success)
			name = Regex.Match (taxobox, binomial_pattern).Groups[1].Value.Trim();		// use |binomial=
		else if (Regex.Match (taxobox, name_pattern).Success)
			name = Regex.Match (taxobox, name_pattern).Groups[1].Value.Trim();			// fallback to |name=
		}
	else if ("speciesbox" == template_name)
		{
		if (Regex.Match (taxobox, taxon_pattern).Success)
			name = Regex.Match (taxobox, taxon_pattern).Groups[1].Value.Trim();			// use |taxon=
		else if (Regex.Match (taxobox, genus_pattern).Success && Regex.Match (taxobox, species_pattern).Success)
			name = Regex.Match (taxobox, genus_pattern).Groups[1].Value.Trim() + " " + Regex.Match (taxobox, species_pattern).Groups[1].Value.Trim();
		else if (Regex.Match (taxobox, name_pattern).Success)
			name = Regex.Match (taxobox, name_pattern).Groups[1].Value.Trim();			// fallback to |name=
		}

	if (null == name)													// when none of the above
		{
		name = article_title;											// TODO: don't use article title?
		error_log_add ("using article title");
		}

	name = species_name_cleanup (name);									// remove markup, extinction markers, disambiguation, etc

	if (!Regex.Match (Uri.UnescapeDataString (name), @"[A-Za-z]+ [A-Za-z]+").Success)	// does <name> look like a binomial?
		{
		error_log_add ("name not a binomial: " + name);
		return null;
		}

	return name;
	}


//---------------------------< T A X O N _ I D _ F R O M _ O L D _ F O R M _ U R L _ G E T >------------------
//
// loops through a series of old-form IUCN urls and returns the taxon id if the pattern matches; null else
//

private string taxon_id_from_old_form_url_get (string text)
	{
	foreach (string url_pattern in url_patterns)						// loop through a series of old-form url patterns
		{
		Match url_match = Regex.Match (text, url_pattern);
		if (url_match.Success)											// if found
			return url_match.Groups[1].Value.Trim();					// extract and return the taxon id
		}
	return null;
	}


//---------------------------< P L A I N _ T E X T _ T A X O N _ I D _ G E T >--------------------------------
//
// extract taxon id from IUCN page, doi, or url.  For plain-text citations, accept any form of iucn url when
// attempting to get the taxon id; prefer page -> doi -> url; returns taxon id if available, null else
//

private string plain_text_taxon_id_get (string plain_text)
	{
	if (Regex.Match (plain_text, plain_text_page_taxon_id).Success)		// get taxon id from page?
		return Regex.Match (plain_text, plain_text_page_taxon_id).Groups[1].Value;

	if (Regex.Match (plain_text, plain_text_doi_taxon_id).Success)		// get taxon id from doi?
		return Regex.Match (plain_text, plain_text_doi_taxon_id).Groups[1].Value;

	if (Regex.Match (plain_text, plain_text_taxon_id_url).Success)		// get taxon id from url?
		return Regex.Match (plain_text, plain_text_taxon_id_url).Groups[1].Value;

	return null;														// couldn't find taxon id; might not be iucn reference
	}


//---------------------------< I S _ P L A I N _ T E X T _ R E J E C T E D >----------------------------------
//
// evaluates <plain_text> looking for things that oughtn't to be there or that are not currently supported
// returns true when <plain_text> is rejected; false else
//

private bool is_plain_text_rejected (string plain_text)
	{
	if (Regex.Match (plain_text, @"\{\{\s*[Cc]it[ae]").Success)			// if 'plain text' has {{cit...}} template
		{
	//	error_log_add ("is_plain_text_rejected(): plain-text has cite template: " + plain_text);	// don't do this because it alarms on valid cite IUCN templates
		return true;													// skip this reference
		}

	if (Regex.Match (plain_text, amended_text).Success)
		{
		error_log_add ("is_plain_text_rejected(): plain-text has amended text");
		return true;													// because API doesn't yet identify amended assessment year
		}

	if (Regex.Match (plain_text, errata_text).Success)
		{
		error_log_add ("is_plain_text_rejected(): plain-text has errata text");
		return true;													// because API doesn't yet identify errata assessment year
		}

	return false;
	}


//---------------------------< S P E C I E S _ N A M E _ C L E A N U P >--------------------------------------
//
// removes stuff that isn't part of the binomial; returns name modified or not.
//

private string species_name_cleanup (string name)
	{
	name = Regex.Replace (name, angle_open, "<");						// restore angle brackets of html comments and tags that might be part of <name>
	name = Regex.Replace (name, angle_close, ">");

	foreach (string [] cleanup_pattern in cleanup_patterns)
		name = Regex.Replace (name, cleanup_pattern[0], cleanup_pattern[1]);

	name = name.Trim();													// and remove any leading/trailing whitespace
	name = Uri.EscapeDataString (name);									// percent encode uri reserved characters

	return name;
	}


//---------------------------< C I T E _ I U C N _ G E T >----------------------------------------------------
//
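// creates {{cite IUCN}} template from api call.  Tries <first_url> first; when that return holds a citation,
// <second_url> is ignored, else tries <second_url>.  Returns null when a fetch fails, when neither return holds a
// citation, when the citation cannot be parsed, or when page and doi assessment ids are mismatched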
//

private string cite_iucn_get (string first_url, string second_url, string ArticleTitle, string taxon_id, string species_name)
	{
	string citation_from_api = null;
	string raw_citation = null;

	if ((null == first_url) && (null == second_url))
		return null;

	var urls = new List<string>();
	urls.Add (first_url);
	urls.Add (second_url);

	foreach (string url in urls)
		{
		if (null != url)
			{
			citation_from_api = api_fetch (url, ArticleTitle);				// fetch citation from the IUCN API

			if (null == citation_from_api)
				return null;

			if (Regex.Match (citation_from_api, citation_from_api_pattern).Success)
				{
				raw_citation = Regex.Match (citation_from_api, citation_from_api_pattern).Groups[1].Value.Trim();
				break;
				}
			}
		}

	if (null == raw_citation)												// <raw_citation> must have a value
		{
		string text = "cite_iucn_get(): API did not return citation:";
		if (null != taxon_id)
			text = text + " id: " + taxon_id;
		if (null != species_name)
			text = text + " name: " + species_name;

		text = text + " " + code_nowiki (citation_from_api);

		error_log_add (text);
		api_no_cite_return_count++;
		return null;
		}

	string author_list = "";
	string date = "";
	string title = "";
	string volume = "";
	string page = "";
	string page_assessment = "";
	string doi = "";
	string doi_assessment = "";
	string access_date = "";

	Match parse = Regex.Match (raw_citation, parse_pattern);
	if (parse.Success)
		{
		author_list = author_names_get (parse.Groups[1].Value.Trim());
		date = @" |date=" + parse.Groups[2].Value.Trim();
		title = title_get (parse.Groups[3].Value.Trim());
		volume = @" |volume=" + parse.Groups[4].Value.Trim();
		page = @" |page=" + parse.Groups[5].Value.Trim();
		page_assessment = parse.Groups[6].Value.Trim();
		doi = @" |doi=" + parse.Groups[7].Value.Trim();
		doi_assessment = parse.Groups[8].Value.Trim();
		access_date = @" |access-date=" + parse.Groups[9].Value.Trim();
		}
	else
		{
		error_log_add ("cite_iucn_get(): parse failure: " + code_nowiki (citation_from_api));
		parse_fail_count++;
		return null;
		}

	if (page_assessment != doi_assessment)						// until errata date information available from the API
		{
		error_log_add ("cite_iucn_get(): doi/page mismatch: page assessment: " + code_nowiki (parse.Groups[5].Value.Trim()));
		page_doi_skip_count++;									// skip template when page- and doi-assessment ids are mismatched
		return null;
		}

	return @"{{cite IUCN" + author_list + date + title + volume + page + doi + access_date + @"}}";
	}


//---------------------------< A P I _ S P E C I E S _ D A T A _ G E T >--------------------------------------
//
// using taxon name, attempt to get species data from the IUCN API.
//

private bool api_species_data_get (string taxobox_species_name_val, ref string api_status_val, ref string api_status_system_val, string article_title)
	{
	if (null == taxobox_species_name_val)													// when taxobox_species_name_get() can't get a binomial-like name
		return false;

	string api_url = api_species_url + taxobox_species_name_val + api_token;				// build a url from its various parts (taxon name)
	string species_from_api = api_fetch (api_url, article_title);							// fetch species data from the IUCN API (taxon name)

	if (null == species_from_api)															// if the api call failed
		return false;																		// abandon

	if (Regex.Match (species_from_api, status_from_api_pattern).Success)					// update <api_status_val> from api return
		api_status_val = Regex.Match (species_from_api, status_from_api_pattern).Groups[1].Value;

	if (Regex.Match (species_from_api, status_system_from_api_pattern).Success)				// update <api_status_system_val> from api return
		{
		int year = Int32.Parse (Regex.Match (species_from_api, status_system_from_api_pattern).Groups[1].Value);	// convert to an integer
		api_status_system_val = ((2000 < year) ? "IUCN3.1" : "IUCN2.3");					// and then convert to the appropriate status system
		}

	if ((null == api_status_val) || (null == api_status_system_val))						// if either of these is null, declare an error
		{
		error_log_add ("api_species_data_get(): API did not return species data: " + code_nowiki (species_from_api));
		api_no_species_return_name_count++;
		return false;																		// and abandon
		}

	return true;
	}


//---------------------------< A P I _ F E T C H >------------------------------------------------------------
//
// calls the iucn api with <api_url>; returns raw data string on success; null else.  Bumps the api call counter
// and, for every call after the first, pauses 3 seconds before the call
//

private string api_fetch (string api_url, string ArticleTitle)
	{
	if (0 < api_call_count)												// pause here for 3 seconds if <api_call_count> is greater than 0 (pause is skipped for the first api access)
		System.Threading.Thread.Sleep (3000);							// this prevents us from banging on the API too quickly

	api_call_count++;													// bump the call counter
	string string_from_api = null;

	try
		{
		// this WebRequest code courtesy of en.wiki editor User:DavidBrooks
		System.Net.HttpWebRequest webRequest = (System.Net.HttpWebRequest)System.Net.WebRequest.Create(api_url);
		webRequest.UserAgent = "Wikipedia IUCN citation update experiment (https://en.wikipedia.org/wiki/User:Trappist_the_monk)";
		using (System.Net.WebResponse response = webRequest.GetResponse())		// dispose of response, stream, and reader when done
		using (System.IO.Stream str = response.GetResponseStream())
		using (System.IO.StreamReader reader = new System.IO.StreamReader(str))
			string_from_api = reader.ReadToEnd();
		}
	catch
		{
		error_log_add ("api_fetch(): Exception occurred reading: " + code_nowiki (api_url));
		api_fetch_fail_count++;
		return null;
		}

	return string_from_api;
	}


//---------------------------< A U T H O R _ N A M E S _ G E T >----------------------------------------------
//
// attempts to extract individual author names from iucn api citation.  Derived from [[Module:cite IUCN]] function
// make_cite_iucn()
//

private string author_names_get (string raw_author_list)
	{
	string collaboration = null;
	string pattern = @"(,\s+[A-Z]),";													// for when iucn omits the dot after a lone initial: 'Smith, J,' -> 'Smith, J.,'
	raw_author_list = Regex.Replace (raw_author_list, pattern, "$1" + ".,");
	pattern = @"(\.[A-Z]),";															// for when iucn omits the dot after the last of several initials: 'A.B,' -> 'A.B.,'
	raw_author_list = Regex.Replace (raw_author_list, pattern, "$1" + ".,");

	pattern = @"\s\(([^\)]+)\)$";

	if (Regex.Match (raw_author_list, pattern).Success)
		{
		collaboration = Regex.Match (raw_author_list, pattern).Groups[1].Value.Trim();	// save the collaboration name
		raw_author_list = Regex.Replace (raw_author_list, pattern, "");					// remove collaboration from raw_author_list
		}

	raw_author_list = Regex.Replace (raw_author_list, @"\.?,?\s+&\s+", ".|");			// replace <opt. dot><opt. comma><space><ampersand><space> with <dot><pipe>
	raw_author_list = Regex.Replace (raw_author_list, @"\.,\s+", ".|");					// replace <dot><comma><space> with <dot><pipe>
	raw_author_list = Regex.Replace (raw_author_list, @"(\.[A-Z]),\s+", "$1.|");		// special case where iucn drops the dot after an initial

	string 		author_list = "";
	string[]	authors = Regex.Split (raw_author_list, @"\|");							// split the string on the <pipe>
	int			i = 1;

	foreach (string author in authors)
		{
		if (1 == i)
			author_list = author_list + " |author" + "=" + author;						// don't enumerate first author
		else
			author_list = author_list + " |author" + i + "=" + author;
		i++;
		}

	if (null != collaboration)
		author_list = author_list + " |collaboration=" + collaboration;

	return author_list;
	}
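

//---------------------------< A U T H O R _ N A M E S _ G E T _ E X A M P L E >------------------------------
//
// illustrative sketch only, not called by the task: shows what author_names_get() makes of a typical author list.
// <author_names_get_example> and the sample names are hypothetical; the sample is assumed to resemble the author
// portion of an iucn api citation
//

private void author_names_get_example ()
	{
	string author_list = author_names_get ("Smith, J., Jones, A.B. & Brown, C.");

	// the ampersand and the <dot><comma><space> separators become pipes; the list is split on the pipes and
	// every author after the first is enumerated
	System.Diagnostics.Debug.Assert (" |author=Smith, J. |author2=Jones, A.B. |author3=Brown, C." == author_list);
	}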


//---------------------------< T I T L E _ G E T >------------------------------------------------------------
//
// extracts title from iucn API citation; attempts to add markup so that it renders correctly
//

private string title_get (string raw_title)
	{
	string title = null;														// formatted title goes here
	string errata = "";															// errata year, if present, goes here; empty string for concatenation
	string amends = "";															// amends year, if present, goes here; empty string for concatenation
	string pattern = null;
	string replace = null;

	foreach (string[] search_and_replace in search_and_replaces)
		{
		pattern = search_and_replace[0];
		replace = search_and_replace[1];										// replace includes wiki markup for title
		if (Regex.Match (raw_title, pattern).Success)
			{
			title = Regex.Replace (raw_title, pattern, replace);
			break;
			}
		}

	if (null == title)
		{
		title = "''" + raw_title + "''";										// pattern not found apply italic markup to raw_title from API citation
	//	error_log_add ("title_get(): using raw title: " + raw_title);			// not really an error
		}

	pattern = errata_text;														// look for an errata string; as of 2021-10-01, errata string not available in API citation
	Match match = Regex.Match (title, pattern);
	if (match.Success)
		errata = " |errata=" + match.Groups[1].Value.Trim();

	pattern = amended_text;														// look for an amended string; as of 2021-10-01, amended string not available in API citation
	match = Regex.Match (title, pattern);
	if (match.Success)
		amends = " |amends=" + match.Groups[1].Value.Trim();

	return " |title=" + title + errata + amends;
	}


//---------------------------< H I D E >----------------------------------------------------------------------
//
// HIDE TEMPLATES: find templates that are not <dont_hide>; replace the opening {{ with __0P3N__, the closing }}
// with __CL0S3__, and internal | (pipes) with __P1P3__
//
// single curly braces in urls and other parameter values can confuse other regex in this code so replace {
// with __0CU!21Y__ and } with __CCU!21Y__
//

private string hide (string ArticleText, string dont_hide)
	{
	string pattern = @"\{\{(?!\s*" + dont_hide + @")[^\{\}]*\}\}";
	if (Regex.Match (ArticleText, pattern).Success)
		{
		ArticleText = Regex.Replace(ArticleText, pattern,
			delegate(Match match)
				{
				string	fixed_template;										// a hidden template is assembled here
				string	raw_template = match.Groups[0].Value;				// the whole template

				pattern = @"\{\{";											// hide the opening {{
				fixed_template = Regex.Replace (raw_template, pattern, "__0P3N__");

				pattern = @"\}\}";											// hide the closing }}
				fixed_template = Regex.Replace (fixed_template, pattern, "__CL0S3__");

				pattern = @"\|";											// and hide the pipes
				fixed_template = Regex.Replace (fixed_template, pattern, "__P1P3__");

				return fixed_template;
				});
		}

	pattern = @"(\<!\-{2,}\s*[^\>\|\}]*)\{\{(\s*" + dont_hide + @"[^\}]*)\}\}([^\>]*\-{2,}\>)";		// <!-- {{citx...}} -->
	ArticleText = Regex.Replace(ArticleText, pattern, "$1__0P3N__$2__CL0S3__$3");

	pattern = @"\{\|";														// open table markup
	ArticleText = Regex.Replace(ArticleText, pattern, "__0T4BL3__");

	pattern = @"\|\}(?!\})";												// close table markup
	ArticleText = Regex.Replace(ArticleText, pattern, "__CT4BL3__");

	pattern = @"([^\{])\{([^\{])";											// single opening curly brace
	ArticleText = Regex.Replace(ArticleText, pattern, "$1__0CU!21Y__$2");

	pattern = @"([^\}])\}([^\}])";											// single closing curly brace
	ArticleText = Regex.Replace(ArticleText, pattern, "$1__CCU!21Y__$2");

	pattern = @"\[\[(?![Ff]ile|[Ii]mage)([^\|\]]+)\|([^\]]+)\]\]";			// HIDE complex wikilinks: [[article title|label]] to __WL1NK_O__article title__P1P3__label__WL1NK_C__
	ArticleText = Regex.Replace(ArticleText, pattern, "__WL1NK_O__$1__P1P3__$2__WL1NK_C__");	// [[File: with wikilinks inside can be confusing

	pattern = @"\[\[([^\]]+)\]\]";											// HIDE simple wikilinks: [[article title]] to __WL1NK_O__article title__WL1NK_C__
	ArticleText = Regex.Replace(ArticleText, pattern, "__WL1NK_O__$1__WL1NK_C__");

	return ArticleText;
	}


//---------------------------< U N H I D E >------------------------------------------------------------------
//
// UNHIDE TEMPLATES: find templates and wikilinks that are hidden; replace the 'hide' keywords with the
// appropriate wiki markup
//

private string unhide (string ArticleText)
	{
	ArticleText = Regex.Replace(ArticleText, @"__WL1NK_O__", "[[");		// UNHIDE: replace __WL1NK_O__ with [[
	ArticleText = Regex.Replace(ArticleText, @"__WL1NK_C__", "]]");		// UNHIDE: replace __WL1NK_C__ with ]]
	ArticleText = Regex.Replace(ArticleText, @"__P1P3__", "|");			// UNHIDE: replace __P1P3__ with |

	ArticleText = Regex.Replace(ArticleText, @"__0T4BL3__", "{|");		// UNHIDE: replace __0T4BL3__ with {|
	ArticleText = Regex.Replace(ArticleText, @"__CT4BL3__", "|}");		// UNHIDE: replace __CT4BL3__ with |}

	ArticleText = Regex.Replace(ArticleText, @"__0CU!21Y__", "{");		// UNHIDE: replace __0CU!21Y__ with {
	ArticleText = Regex.Replace(ArticleText, @"__CCU!21Y__", "}");		// UNHIDE: replace __CCU!21Y__ with }

	ArticleText = Regex.Replace(ArticleText, @"__0P3N__", "{{");		// UNHIDE: replace __0P3N__ with {{
	ArticleText = Regex.Replace(ArticleText, @"__CL0S3__", "}}");		// UNHIDE: replace __CL0S3__ with }}

	return ArticleText;
	}
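

//---------------------------< H I D E _ U N H I D E _ E X A M P L E >----------------------------------------
//
// illustrative sketch only, not called by the task: hide() followed by unhide() is expected to give back the
// original text.  <hide_unhide_example>, the sample wikitext, and the literal <dont_hide> value are hypothetical;
// the value that the task proper passes as <dont_hide> may differ
//

private void hide_unhide_example ()
	{
	string text = "{{cite web|url=x}} [[A|B]] {{cite IUCN|title=y}}";
	string hidden = hide (text, "cite IUCN");

	System.Diagnostics.Debug.Assert (!hidden.Contains ("{{cite web"));				// the other template is hidden
	System.Diagnostics.Debug.Assert (hidden.Contains ("{{cite IUCN|title=y}}"));	// the <dont_hide> template is left alone
	System.Diagnostics.Debug.Assert (!hidden.Contains ("[[A|B]]"));					// the complex wikilink is hidden
	System.Diagnostics.Debug.Assert (text == unhide (hidden));						// round trip restores the original
	}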


//---------------------------< S U M M A R Y _ C O N C A T >--------------------------------------------------
//
// concatenates text onto an existing edit summary string, limiting the string to a length of no more than 347
// characters.  When <summary> appended with <text> would be longer than the allowed 347 character limit, this
// function replaces <text> with an ellipsis.  Once an ellipsis is added, no more <text> can be added to <summary>
//

private string summary_concat (string summary, string text)
	{
	if (0 <= summary.IndexOf ("..."))									// if ellipsis already present in <summary>, abandon
		return summary;

	if (347 >= (summary.Length + text.Length + 3))						// if <summary> plus <text> fits within the 347 char limit (+ 3 to make sure we can add ellipsis later if necessary)
		return summary + text;											// append <text> to <summary> and done

	return summary + "...";												// append ellipsis instead
	}
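

//---------------------------< S U M M A R Y _ C O N C A T _ E X A M P L E >----------------------------------
//
// illustrative sketch only, not called by the task: walks summary_concat() through its three outcomes (append,
// ellipsis, no further change).  <summary_concat_example> and the sample strings are hypothetical
//

private void summary_concat_example ()
	{
	string summary = new string ('x', 340);								// a summary already 340 characters long

	summary = summary_concat (summary, "; y");							// 340 + 3 + 3 = 346; fits so <text> is appended
	System.Diagnostics.Debug.Assert (343 == summary.Length);

	summary = summary_concat (summary, "; zz");							// 343 + 4 + 3 = 350; too long so ellipsis is appended instead
	System.Diagnostics.Debug.Assert (summary.EndsWith ("..."));
	System.Diagnostics.Debug.Assert (346 == summary.Length);

	summary = summary_concat (summary, "; y");							// ellipsis already present; <summary> returned unchanged
	System.Diagnostics.Debug.Assert (346 == summary.Length);
	}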


//---------------------------< C O D E _ N O W I K I >--------------------------------------------------------
//
// wraps 'text' in <code><nowiki>text</nowiki></code> tags for error log
//

private string code_nowiki (string text)
	{
	return "<code><nowiki>" + text + "</nowiki></code>";
	}


//---------------------------< E R R O R _ L O G _ A D D >----------------------------------------------------
//
// adds an error message to the error log list.  Probably superfluous.
//

private void error_log_add (string message)
	{
	error_log_list.Add (message);
	}


//---------------------------< L O G _ E R R O R S >----------------------------------------------------------
//
// writes the content of the error log list to the log file, prettified with wiki markup.
//

private void log_errors (string article_title, List<string> error_log_list)
	{
	string	time = DateTimeOffset.Now.ToString("u").Substring (11, 9);		// HH:mm:ssZ
	string	date = DateTimeOffset.Now.ToString("u").Substring (0, 10);		// yyyy-MM-dd

	string	log_file = @"Z:\Wikipedia\AWB\Monkbot_tasks\Monkbot_task_19_cite_iucn_update\logs\" + date + ".txt";

	using (System.IO.StreamWriter sw = System.IO.File.AppendText (log_file))	// using closes the log file even when a write throws
		{
		sw.WriteLine ("*[[" + article_title + "]] (" + time + "):");

		foreach (string list_item in error_log_list)
			sw.WriteLine ("*:" + list_item);
		}

	error_log_list.Clear();
	}


//---------------------------< C O U N T E D _ R E P L A C E >------------------------------------------------
//
// common function to replace <pattern> with <replace> and bump <count> until no more <pattern>
//

private string counted_replace (string template, string pattern, string replace, ref int count)
	{
	Regex rgx = new Regex (pattern);											// make a new regex from <pattern>

	while (rgx.IsMatch (template))												// look for <pattern> in <template>
		{
		template = rgx.Replace (template, replace, 1);							// replace one copy of <pattern> with <replace>
		count++;																// bump the counter
		}

	return template;
	}
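

//---------------------------< C O U N T E D _ R E P L A C E _ E X A M P L E >--------------------------------
//
// illustrative sketch only, not called by the task: counted_replace() replaces one match per pass and bumps the
// counter once per replacement.  <counted_replace_example> and the sample text are hypothetical
//

private void counted_replace_example ()
	{
	int count = 0;
	string result = counted_replace ("a  b  c", "  ", " ", ref count);	// collapse each double space to a single space

	System.Diagnostics.Debug.Assert ("a b c" == result);
	System.Diagnostics.Debug.Assert (2 == count);						// two double spaces, two replacements
	}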


//===========================<< S T A T I C   D A T A >>======================================================

static bool		status_added = false;					// set to true when |status= created in taxobox

static int		plain_text_modified_count = 0;			// number of plain-text citations that were modified from the iucn api
static int		plain_text_count = 0;					// total number of plain-text iucn references

static int		api_call_count = 0;						// number of api calls made; this value not reported in edit summary
static int		api_fetch_fail_count = 0;				// number of api fetches that failed
static int		api_no_cite_return_count = 0;			// number of times that the api returned a non-citation value like: {"value":"0","species":"202965"}
static int		parse_fail_count = 0;					// number of times that we couldn't parse the api return
static int		page_doi_skip_count = 0;				// number of templates or plain-text references skipped because page and doi assessment ID mismatch (could be errata but since no errata date ...)
static int		api_no_species_return_name_count = 0;	// number of times that the api returned a non-species value (species name)
static int		api_no_species_return_id_count = 0;		// number of times that the api returned a non-species value (species id for {{IUCN status}})
static int		iucn_status_updated_count = 0;			// number of times that we updated the iucn status in taxobox-like templates
static int		iucn_status_confirmed_count = 0;		// number of times that we confirmed the iucn status in taxobox-like templates
static int		iucn_status_system_updated_count = 0;	// number of times that we updated the iucn status system in taxobox-like templates

static string	taxobox_blank = null;					// gets blank taxobox as flag
static bool		status_ref_added = false;				// set to true when |status_ref= created
static bool		status_system_added = false;			// set to true when |status_system= created
static bool		status_ref_updated = false;				// set to true when |status_ref= updated
static bool		status_ref_current = false;				// set to true when |status_ref= less than 6 months old
static int		duplicates_removed_count = 0;			// number of duplicate status references removed


static string	sc_ref_tag_begin = @"\<[Rr][Ee][Ff]\s*name\s*=\s*""?";	// these for taxobox |status_ref= handling
static string	sc_ref_tag_end = @"""?\s*/\>";

static string	ref_def_begin = @"\<[Rr][Ee][Ff]\s*name\s*=\s*""?";		// these for taxobox |status_ref= <ref name=... /> handling to locate the matching definition
static string	ref_def_end = @"""?\s*\>[^\<]*\</[Rr][Ee][Ff]\>";

static string	reflist_cleanup = @"(\{\{\s*[Rr]eflist[^\}]*\|\s*refs\s*=[^\}]*)\<\s*[Rr][Ee][Ff][^\>]*/\>";

static string	hide_non_ref_tag_pattern = @"\<((?!/[Rr][Ee][Ff]|[Rr][Ee][Ff])[^\>]*)\>";
static string	angle_open = "__4ng13_0__";
static string	angle_close = "__4ng13_C__";
static string	hide_non_ref_replace_val = angle_open + "$1" + angle_close;

static int		iucn_template_count = 0;				// total number of cite IUCN templates
static int		other_template_count = 0;				// total number of cite journal/web templates


//---------------------------< A P I >------------------------------------------------------------------------

static string	api_species_url = "http://apiv3.iucnredlist.org/api/v3/species/";	// for fetching species data from the api by name
static string	api_species_id_url = api_species_url + "id/";						// for fetching species data from the api by taxon id (for {{IUCN status}})
static string	api_id_url = api_species_url + "citation/id/";						// for fetching citation data from the api using taxon id
static string	api_name_url = api_species_url + "citation/";						// for fetching citation data from the api using binomial

static string	iucn_api_token_file = @"Z:\Wikipedia\AWB\Monkbot_tasks\Monkbot_task_19_cite_iucn_update\iucn_api_token";	// token required to be private; stored locally here
static string	api_token = null;													// stored at iucn_api_token_file


//---------------------------< C I T E   I U C N >------------------------------------------------------------

	static string	IS_CITE_IUCN = @"(?:[Cc]ite iucn|[Cc]ite IUCN)";
	static string	iucn_template_pattern = @"\{\{\s*" + IS_CITE_IUCN + @"[^\}]+\}\}";				// basic cite IUCN template pattern
	static string	iucn_title = @"\|\s*title\s*=([^\|\}]*)";										// everything in cite IUCN |title= for api calls

	static string[] url_patterns = new string[]
		{
		@"https?://www\.iucnredlist\.org/details/(\d+)/\b(?:all|full)",
		@"https?://www\.iucnredlist\.org/details/full/(\d+)/\d+",
		@"https?://www\.iucnredlist\.org/details/(\d+)/\d+",
		@"https?://www\.iucnredlist\.org/details/(\d+)/?",
		@"https?://www\.iucnredlist\.org/details/summary/(\d+)",
		@"https?://www\.iucnredlist\.org/search/details\.php/(\d+)/(?:all|summ)",
		@"https?://oldredlist\.iucnredlist\.org/details/(\d+)/\d+",
		};
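
	// Illustrative sketch only, not part of the task code: one way the url_patterns list above might be used to
	// pull a taxon id out of an old-form url.  The helper name sketch_taxon_id_from_url is hypothetical.
	// Patterns are tried in list order; the first match wins and capture group 1 holds the taxon id.

	static string sketch_taxon_id_from_url (string url)
		{
		foreach (string pattern in url_patterns)
			{
			System.Text.RegularExpressions.Match m = System.Text.RegularExpressions.Regex.Match (url, pattern);
			if (m.Success)
				return m.Groups[1].Value;				// taxon id as a string of digits
			}
		return null;									// not a recognized old-form url
		}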

	static string	ref_param_empty = @"\|\s*ref\s*=\s*([\|\}])";
	static string	ref_param_not_empty = @"\|\s*ref\s*=\s*([^\|\}]+)";


//---------------------------< C I T E   J O U R N A L / W E B >----------------------------------------------

	static string	IS_CITE_OTHER = @"(?:[Cc]ite journal|[Cc]ite web)";		// TODO: expand this to include more redirects?
	static string	other_template_pattern = @"\{\{\s*" + IS_CITE_OTHER + @"[^\}]+\}\}";				// basic cite journal / cite web template pattern



//---------------------------< N E W   C I T E   I U C N >----------------------------------------------------
//
// parse_pattern doesn't work for citations like this (from [[Cantleya]]) because of the 'extra' year ahead of
// the binomial:
//		Asian Regional Workshop (Conservation & Sustainable Management of Trees, Viet Nam, August 1996) 1998. Cantleya corniculata. The IUCN Red List of Threatened Species 1998: e.T33197A9760751. https://dx.doi.org/10.2305/IUCN.UK.1998.RLTS.T33197A9760751.en .Downloaded on 1 October 2021
//
// Haven't seen enough of these to attempt a second parse pattern
//

//static string	citation_from_api_pattern = @"\[\{""citation"":""([^""]*)""\}\]";
static string	citation_from_api_pattern = @"\[\{""citation"":""([^\}]*)""\}\]";
static string	parse_pattern = @"(^\D+)(\d{4})\.(\D+)\. The IUCN Red List of Threatened Species (\d{4}): (e\.T\d+A(\d+))\.\D+(10\.2305\/IUCN\.UK\.[\d\-]+\.RLTS\.T\d+A(\d+)\S+)\D+(\d{1,2} [A-Za-z]+ \d{4})";

static string[][] search_and_replaces =
	{
	new string[] {@"(.+?)\sssp\.\s+(.+?)\s(\([^\)]+\))$",		@"''$1'' ssp. ''$2'' $3"},		// binomen ssp. subspecies (zoology) with errata or amended text
	new string[] {@"(.+?)\sssp\.\s+(.+)",						@"''$1'' ssp. ''$2''"},			// binomen ssp. subspecies (zoology)
	new string[] {@"(.+?)\ssubsp\.\s+(.+?)\s(\([^\)]+\))$",		@"''$1'' subsp. ''$2'' $3"},	// binomen subsp. subspecies (botany) with errata or amended text
	new string[] {@"(.+?)\ssubsp\.\s+(.+)",						@"''$1'' subsp. ''$2''"},		// binomen subsp. subspecies (botany)
	new string[] {@"(.+?)\svar\.\s+(.+?)\s+(\([^\)]+\))$",		@"''$1'' var. ''$2'' $3"},		// binomen var. variety (botany) with errata or amended text
	new string[] {@"(.+?)\svar\.\s+(.+)",						@"''$1'' var. ''$2''"},			// binomen var. variety (botany)
	new string[] {@"(.+?)\ssubvar\.\s+(.+?)\s(\([^\)]+\))$",	@"''$1'' subvar. ''$2'' $3"},	// binomen subvar. subvariety (botany) with errata or amended text
	new string[] {@"(.+?)\ssubvar\.\s+(.+)",					@"''$1'' subvar. ''$2''"},		// binomen subvar. subvariety (botany)
	new string[] {@"(.+?)\s*(\([^\)]+\))$",						@"''$1'' $2"}					// binomen with errata or amended text
	};
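
// Illustrative sketch only, not part of the task code: how the search_and_replaces table above might be applied
// to a title taken from the api citation.  The helper name sketch_italicize_title is hypothetical.  The table is
// ordered most-specific first, so the first pattern that matches is the one applied.  ASSUMPTION: a title that
// matches none of the patterns is a plain binomen and is simply wrapped in italic markup.

static string sketch_italicize_title (string title)
	{
	foreach (string[] sr in search_and_replaces)
		{
		if (System.Text.RegularExpressions.Regex.IsMatch (title, sr[0]))
			return System.Text.RegularExpressions.Regex.Replace (title, sr[0], sr[1]);
		}
	return "''" + title + "''";						// plain binomen
	}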

static string	errata_text = @"\(errata version published in (\d{4})\)";
static string	amended_text = @"\(amended version of (\d{4}) assessment\)";


//---------------------------< T A X O B O X >----------------------------------------------------------------

static string	HIDE_ALL_BUT_TAXOBOX = @"(?:[Tt]axobox\s*\||[Ss]peciesbox\s*\|)";							// this to prevent confusion with {{Taxobox authority}} when hiding
static string	IS_TAXOBOX = @"(?:[Tt]axobox|[Ss]peciesbox)";												// for hiding all non-taxobox-like templates
static string	taxobox_template_pattern = @"(\{\{\s*(" + IS_TAXOBOX + @"))[^\}]+(\}\})";					// basic taxobox-like template pattern; TODO: {{subspeciesbox}}?
static string	taxobox_blank_pattern = @"\{\{\s*" + IS_TAXOBOX + @"\}\}";

static string	taxobox_new_stat_sys_ref_pattern = @"(\{\{\s*" + IS_TAXOBOX + @"[^\}]+?)(\s*)(\}\})";		// used to create new |status=, |status_system=, and |status_ref= params in taxobox
static string	taxobox_status_ref_pattern = @"(\|\s*status_ref\s*=\s*)(\<ref[^\>]*\>)[^\<]*(\</ref\>)";	// used to replace |status_ref= param in taxobox
static string	taxobox_status_ref_empty_pattern = @"(\|\s*status_ref\s*=[ \t]*)([\r\n]*[\|\}])";			// used to add reference to |status_ref= param in taxobox

static string	taxobox_status_sc_ref_pattern = @"(\|\s*status_ref\s*=\s*)(\<[Rr][Ee][Ff][^\>]+/\>)";		// used to replace |status_ref= param in taxobox

static string	taxobox_status_ref = null;																	// the 'new' value for |status_ref=
static string	taxobox_status_ref_open_tag = null;															// its matching ref open tag
static string	taxobox_status_ref_sc_tag = null;															// and its matching self-closed tag

static string	stray_dot = @"(\|\s*status_ref\s*=\s*)\.";													// delete stray dot; because I found one such (Astroblepus pholeter)
static string	stray_splat = @"(\|\s*status_ref\s*=\s*)\*";												// delete stray splat; because I found one such (Gray short-tailed bat)
static string	stray_equal = @"(\|\s*status_ref\s*=\s*)=";													// delete stray equal; because I found one such (Cyprinus hieni)
static string	stray_nbsp = @"(\|\s*status_ref\s*=\s*)&nbsp;";												// delete stray &nbsp; because I found one such (Euconocephalus remotus)
static string	html_comment = @"(\|\s*status_ref\s*=[^\|\}]*)\<!\-\-[^\>]*\-\-\>";							// and html comments
static string	unrecognized_species_name = null;															// gets taxobox species name that IUCN doesn't recognize


//---------------------------< T A X O B O X _ S T A T U S >--------------------------------------------------

static string	IS_IUCN_STATUS = @"(\b(?:LC|LR/lc|NT|LR/nt|LR/cd|VU|EN|CR|PE|PEW|EW|EX|DD|NE)\b)";			// also used with {{IUCN status}}

static string	taxobox_status_missing = @"(\{\{\s*" + IS_TAXOBOX + @"[^\}]*)\|\s*status\s*=";
static string	taxobox_status_empty = @"(\{\{\s*" + IS_TAXOBOX + @"[^\}]*)\|\s*status\s*=\s*([\|\}])";
static string	taxobox_status_value = @"(\{\{\s*" + IS_TAXOBOX + @"[^\}]*)\|\s*status\s*=\s*([^\|\}]+)";

static string	taxobox_status_pattern = @"(\|\s*status\s*=\s*)[^\|\}]*?(\s*[\|\}])";

static string	status_from_api_pattern = @"""category"":""([^""]+)""";				// for |status=


//---------------------------< T A X O B O X _ S Y S T E M >--------------------------------------------------

static string	IS_IUCN_SYSTEM = @"(\b(?:IUCN2\.3|IUCN3\.1)\b)";											// dots escaped so that only a literal '.' matches

static string	taxobox_system_missing = @"(\{\{\s*" + IS_TAXOBOX + @"[^\}]*)\|\s*status_system\s*=";
static string	taxobox_system_empty = @"(\{\{\s*" + IS_TAXOBOX + @"[^\}]*)\|\s*status_system\s*=\s*([\|\}])";
static string	taxobox_system_value = @"(\{\{\s*" + IS_TAXOBOX + @"[^\}]*)\|\s*status_system\s*=\s*([^\|\}]+)";

static string	taxobox_system_pattern = @"(\|\s*status_system\s*=\s*)[^\|\}]*?(\s*[\|\}])";				// same shape as taxobox_status_pattern; $2 is trailing whitespace plus the terminating | or }

static string	status_system_from_api_pattern = @"""assessment_date"":""(\d+)";	// for |status_system=
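
// Illustrative sketch only, not part of the task code: deriving a |status_system= value from the api species
// data.  status_system_from_api_pattern captures the leading digits of "assessment_date", i.e. the assessment
// year.  The helper name sketch_status_system_from_api is hypothetical.  ASSUMPTION: assessments dated 2001 or
// later use criteria version 3.1 and earlier ones use version 2.3; the task's real cutoff may differ.

static string sketch_status_system_from_api (string api_result)
	{
	System.Text.RegularExpressions.Match m = System.Text.RegularExpressions.Regex.Match (api_result, status_system_from_api_pattern);
	if (!m.Success)
		return null;								// no assessment date in the api result
	int year = int.Parse (m.Groups[1].Value);
	return (2001 <= year) ? "IUCN3.1" : "IUCN2.3";
	}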


//---------------------------< T A X O B O X _ S T A T U S _ R E F >------------------------------------------

static string	taxobox_status_ref_missing = @"(\{\{\s*" + IS_TAXOBOX + @"[^\}]*)\|\s*status_ref\s*=";
static string	taxobox_status_ref_empty = @"(\{\{\s*" + IS_TAXOBOX + @"[^\}]*)\|\s*status_ref\s*=\s*([\|\}])";
static string	taxobox_status_ref_value = @"(\{\{\s*" + IS_TAXOBOX + @"[^\}]*)\|\s*status_ref\s*=\s*([^\|\}]+)";

static string	ref_tag_named_pattern = @"(\<[Rr][Ee][Ff][^\>]*name\s*=\s*""?([^""\>]*)""?\s*\>)";
static string	ref_tag_named_sc_pattern = @"(\<[Rr][Ee][Ff][^\>]*name\s*=\s*""?([^""/]*)""?\s*/\s*\>)";
static string	ref_tag_unnamed_pattern = @"(\<[Rr][Ee][Ff]\>)";


//---------------------------< T A X O B O X _ S P E C I E S _ N A M E >--------------------------------------

static string	binomial_pattern = @"\|\s*binomial\s*=\s*([^\|\}]*)";				// taxobox

static string	taxon_pattern = @"\|\s*taxon\s*=\s*([^\|\}]*)";						// speciesbox
static string	genus_pattern = @"\|\s*genus\s*=\s*([^\|\}]*)";						// these two combined to make binomial name
static string	species_pattern = @"\|\s*species\s*=\s*([^\|\}]*)";

static string	name_pattern = @"\|\s*name\s*=\s*([^\|\}]*)";						// taxobox and speciesbox


//---------------------------< D A T E S >--------------------------------------------------------------------

static Dictionary<string, string> date_patterns = new Dictionary<string, string>()
	{
	{"dmy", @"\d{1,2}\s+([JFMASOND][a-z]+)\s+(\d{4})"},		// dmy
	{"mdy", @"([JFMASOND][a-z]+)\s+\d{1,2}\s*,\s+(\d{4})"},	// mdy
	{"ymd", @"(\d{4})\-(\d{2})\-\d{2}"}						// ymd
	};

static string	preferred_status_ref_tag_name = @"iucn status (\d{1,2}\s+([JFMASOND][a-z]+)\s+(\d{4}))";
static string	access_date = @"\|access\-?date=([^\|\}]+)";

static Dictionary<string, int> months = new Dictionary<string, int>()
	{
	{"january", 1},										// these for dmy and mdy
	{"february", 2},
	{"march", 3},
	{"april", 4},
	{"may", 5},
	{"june", 6},
	{"july", 7},
	{"august", 8},
	{"september", 9},
	{"october", 10},
	{"november", 11},
	{"december", 12},
	{"jan", 1},											// abbreviated month names; also for dmy and mdy
	{"feb", 2},
	{"mar", 3},
	{"apr", 4},
//	{"may", 5},											// same as whole month name; can't have two with the same key
	{"jun", 6},
	{"jul", 7},
	{"aug", 8},
	{"sep", 9},
	{"oct", 10},
	{"nov", 11},
	{"dec", 12},
	{"01", 1},											// these for ymd
	{"02", 2},
	{"03", 3},
	{"04", 4},
	{"05", 5},
	{"06", 6},
	{"07", 7},
	{"08", 8},
	{"09", 9},
	{"10", 10},
	{"11", 11},
	{"12", 12},
	};
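
// Illustrative sketch only, not part of the task code: looking up a month number in the months table above.
// The helper name sketch_month_number is hypothetical.  Keys in the table are lowercase, so a month name
// captured by one of the date_patterns is lowercased before the lookup; 0 is returned for an unknown month.

static int sketch_month_number (string month)
	{
	int number;
	return months.TryGetValue (month.ToLowerInvariant (), out number) ? number : 0;
	}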


//---------------------------< R E M O V E   D U P L I C A T E   S T A T U S   R E F >------------------------

static string[]	symbols = new string[]
	{
	@"\{",
	@"\(",
	@"\|",
	@"\.",
	@"\-",
	@"\)",
	@"\}",
	};

static string	ref_open_tag_unnamed = @"\<[Rr][Ee][Ff]\>";
static string	ref_open_tag_named = @"\<[Rr][Ee][Ff][^\>]*\>";
static string	ref_close_tag = @"\</[Rr][Ee][Ff]>";
static string	bib_open_ul = @"[\r\n]+\*\s*";
static string	bib_close_ul = @"([\r\n]+)";


//---------------------------< S P E C I E S _ N A M E _ C L E A N U P >--------------------------------------
//
// these things must be removed from binomial before calling the api with the binomial
//

static string[][] cleanup_patterns =
	{
	new string[] 	{ref_open_tag_named + @"[^\<]*" + ref_close_tag,	""},	// references; [[Lampadioteuthis]] caused api fetch exception
	new string[] 	{@"\<[Rr][Ee][Ff][^\>]+/\>",	""},						// self-closed references; [[Sand cat]]
	new string[]	{@"\<!\-\-[^\>]*\-\-\>",		""},						// html comment
	new string[] 	{@"[\.;:]+$",		""},									// trailing punctuation
	new string[] 	{"'''(.+)'''",		"$1"},									// bold wiki markup
	new string[] 	{"''(.+)''$",		"$1"},									// italic wiki markup
	new string[] 	{@"""",				""},									// double quote marks
	new string[] 	{"†",				""},									// extinction markers
	new string[] 	{@"\[\[",			""},									// opening wikilink markup
	new string[] 	{@"\]\]",			""},									// closing wikilink markup
	new string[] 	{@"\s*\([^\)]+\)",	""},									// disambiguation
	new string[] 	{@"[\.;:]+$",		""},									// trailing punctuation (again)
	new string[] 	{@"\<nowiki/\>",	""},									// self-closed <nowiki/> tag
	new string[] 	{@"\<nowiki\>",		""},									// opening <nowiki> tag
	new string[] 	{@"\</nowiki\>",	""},									// closing </nowiki> tag
	};


//----------------------------------------< P L A I N _ T E X T >---------------------------------------------
//
// for plaintext references wrapped in <ref>...</ref> tags or in unordered markup (bibliography); must have a
// recognizable page identifier or doi or a url from which a taxon id can be extracted
//

static string	plain_text_ref_pattern = @"(\< *ref[^\>]*\>)([^\<]*)(\</ref>)";								// <ref>anything</ref> ref tags and reference are captured
static string	plain_text_bib_pattern = @"([\r\n]+\*)([^\r\n]*iucnredlist\.org[^\r\n]*)([\r\n]+)"; 		// some sort of iucn ref in unordered list

static string	plain_text_page_taxon_id = @"\be\.T(\d+)A\d+";												// get taxon id from page
static string	plain_text_doi_taxon_id = @"\bRLTS\.T(\d+)A\d+";											// get taxon id from doi
static string	plain_text_taxon_id_url = @"https?://(?:www|oldredlist)\.iucnredlist\.org/\S+?/(\d+)\S+";	// get taxon id from url


//---------------------------< I U C N   S T A T U S >--------------------------------------------------------

static string	iucn_status_template_pattern = @"(\{\{\s*IUCN status[^\}]+\})";
static string	iucn_status_lead = @"(\{\{\s*IUCN status\s*\|\s*)";
static string	iucn_status_status = iucn_status_lead + IS_IUCN_STATUS;
static string	iucn_status_id = @"(\{\{\s*IUCN status\s*\|[^\|]+\|\s*)(\d+)";


// Monkbot_task_19_cite_iucn_update.cs