Changes

sync from sandbox;
Line 250: Line 250:  
  --[[--------------------------< N O R M A L I Z E _ L C C N >--------------------------------------------------
− LCCN normalization (http://www.loc.gov/marc/lccn-namespace.html#normalization)
+ LCCN normalization (https://www.loc.gov/marc/lccn-namespace.html#normalization)
  1. Remove all blanks.
  2. If there is a forward slash (/) in the string, remove it, and remove all characters to the right of the forward slash.
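
The two normalization steps quoted in this hunk can be sketched in Lua as follows. This is a minimal illustration, not the module's own code: `normalize_lccn_partial` is a hypothetical name, and any later normalization steps that fall outside the hunk are deliberately left out.

```lua
-- Hypothetical helper: applies only the two normalization steps quoted in the hunk.
local function normalize_lccn_partial (lccn)
	lccn = lccn:gsub ('%s', '');					-- step 1: remove all blanks (any whitespace)
	local slash = lccn:find ('/', 1, true);			-- step 2: plain-text search for a forward slash
	if slash then
		lccn = lccn:sub (1, slash - 1);				-- drop the slash and everything to its right
	end
	return lccn;
end
```

For example, `normalize_lccn_partial ('2001-000002/AC/r932')` returns `2001-000002`; the hyphen is left in place because hyphen handling is not part of the steps shown here.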
Line 287: Line 287:  
  --[[--------------------------< A R X I V >--------------------------------------------------------------------
− See: http://arxiv.org/help/arxiv_identifier
+ See: https://arxiv.org/help/arxiv_identifier
  format and error check arXiv identifier.  There are three valid forms of the identifier:
Line 381: Line 381:  
  Validates (sort of) and formats a bibcode ID.
− Format for bibcodes is specified here: http://adsabs.harvard.edu/abs_doc/help_pages/data.html#bibcodes
+ Format for bibcodes is specified here: https://adsabs.harvard.edu/abs_doc/help_pages/data.html#bibcodes
  But, this: 2015arXiv151206696F is apparently valid so apparently, the only things that really matter are length, 19 characters
Line 527: Line 527:  
  and terminal punctuation may not be technically correct but it appears, that in practice these characters are rarely
  if ever used in DOI names.
+
+ https://www.doi.org/doi_handbook/2_Numbering.html -- 2.2 Syntax of a DOI name
+ https://www.doi.org/doi_handbook/2_Numbering.html#2.2.2 -- 2.2.2 DOI prefix
  ]]
Line 573: Line 576:  
  '^[^1-9]%d%d%d$', -- 4 digits without subcode (0xxx); accepts: 1000–9999
  '^%d%d%d%d%d%d+', -- 6 or more digits
− '^%d%d?%d?$', -- less than 4 digits without subcode (with subcode is legitimate)
+ '^%d%d?%d?$', -- less than 4 digits without subcode (3 digits with subcode is legitimate)
+ '^%d%d?%.[%d%.]+', -- 1 or 2 digits with subcode
  '^5555$', -- test registrant will never resolve
  '[^%d%.]', -- any character that isn't a digit or a dot
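
The pattern list in this hunk reads as a reject list: a registrant code that matches any entry is treated as malformed. A minimal sketch of that idea follows, assuming the registrant code (the part of the DOI prefix after `10.`) has already been extracted. The names `reject_patterns` and `registrant_looks_valid` are hypothetical, and only the patterns visible on the new side of the hunk are included; the module's full list may be longer.

```lua
-- Patterns copied from the new side of the hunk; the module's full list may contain more.
local reject_patterns = {
	'^[^1-9]%d%d%d$',								-- 4 characters whose first is not 1-9 (e.g. 0xxx)
	'^%d%d%d%d%d%d+',								-- 6 or more digits
	'^%d%d?%d?$',									-- fewer than 4 digits without subcode
	'^%d%d?%.[%d%.]+',								-- 1 or 2 digits with subcode
	'^5555$',										-- test registrant
	'[^%d%.]',										-- any character that is not a digit or a dot
};

-- Hypothetical helper: returns true when no reject pattern matches the registrant code.
local function registrant_looks_valid (registrant)
	for _, pattern in ipairs (reject_patterns) do
		if registrant:match (pattern) then
			return false;
		end
	end
	return true;
end
```

With these patterns, `registrant_looks_valid ('1038')` and `registrant_looks_valid ('123.45')` return `true`, while `'123'`, `'12.34'`, and `'5555'` are rejected. The added fourth pattern is what rejects a 1- or 2-digit registrant that carries a subcode.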
Line 621: Line 625:  
  if ever used in HDLs.
− Query string parameters are named here: http://www.handle.net/proxy_servlet.html.  query strings are not displayed
+ Query string parameters are named here: https://www.handle.net/proxy_servlet.html.  query strings are not displayed
  but since '?' is an allowed character in an HDL, '?' followed by one of the query parameters is the only way we
  have to detect the query string so that it isn't URL-encoded with the rest of the identifier.
Line 631: Line 635:  
  local access = options.access;
  local handler = options.handler;
− local query_params = { -- list of known query parameters from http://www.handle.net/proxy_servlet.html
+ local query_params = { -- list of known query parameters from https://www.handle.net/proxy_servlet.html
  'noredirect',
  'ignore_aliases',
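
The detection described in the comment above ('?' followed by a known query parameter) could look roughly like the sketch below. It is an illustration only: `split_hdl_query` is a hypothetical helper, the list holds just the two parameters visible in this hunk, and the first list entry found wins, which is simpler than searching for the earliest position in the identifier.

```lua
-- Only the entries visible in the hunk; the module's list continues past this point.
local query_params = {'noredirect', 'ignore_aliases'};

-- Hypothetical helper: splits an HDL into the handle proper and a trailing query string.
-- Returns the identifier unchanged plus nil when no known parameter follows a '?'.
local function split_hdl_query (id)
	for _, param in ipairs (query_params) do
		local start = id:find ('?' .. param, 1, true);	-- plain-text search, no pattern magic
		if start then
			return id:sub (1, start - 1), id:sub (start);
		end
	end
	return id, nil;
end
```

So `split_hdl_query ('20.1000/100?noredirect')` returns `20.1000/100` and `?noredirect`, whereas `20.1000/what?ever` comes back whole because `ever` is not a known parameter and the '?' is taken as part of the handle.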
Line 800: Line 804:     
  Determines whether an ISMN string is valid.  Similar to ISBN-13, ISMN is 13 digits beginning 979-0-... and uses the
− same check digit calculations.  See http://www.ismn-international.org/download/Web_ISMN_Users_Manual_2008-6.pdf
+ same check digit calculations.  See https://www.ismn-international.org/download/Web_ISMN_Users_Manual_2008-6.pdf
  section 2, pages 9–12.
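
As a rough illustration of the check described in this comment, the sketch below applies the ISBN-13 style checksum (digits weighted alternately 1 and 3, total divisible by 10) to a 13-digit string that starts with 9790. `ismn_looks_valid` is a hypothetical name and the separator handling is an assumption; the module's own function may differ in both.

```lua
-- Hypothetical helper: true when the string is a 13-digit ISMN (979-0-...) with a valid check digit.
local function ismn_looks_valid (ismn)
	local digits = ismn:gsub ('[%s%-]', '');		-- assumption: spaces and ASCII hyphens are separators
	if not digits:match ('^9790%d%d%d%d%d%d%d%d%d$') then
		return false;								-- not exactly 13 digits beginning 9790
	end
	local sum = 0;
	for i = 1, 13 do
		local weight = (i % 2 == 1) and 1 or 3;		-- odd positions weigh 1, even positions weigh 3
		sum = sum + tonumber (digits:sub (i, i)) * weight;
	end
	return sum % 10 == 0;							-- valid when the weighted sum is a multiple of 10
end
```

For `979-0-2600-0043-8` the weighted sum is 80, so the function returns `true`; changing the last digit to 9 gives 81 and the check fails.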
   Line 849: Line 853:  
  like this:
− |issn=0819 4327 gives: [http://www.worldcat.org/issn/0819 4327 0819 4327] -- can't have spaces in an external link
+ |issn=0819 4327 gives: [https://www.worldcat.org/issn/0819 4327 0819 4327] -- can't have spaces in an external link

  This code now prevents that by inserting a hyphen at the ISSN midpoint.  It also validates the ISSN for length
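
A minimal sketch of the hyphen insertion and length check mentioned here, assuming spaces and ASCII hyphens are the only separators to strip. `format_issn` is a hypothetical helper; it checks the shape of the value only (seven digits followed by a digit or X) and does not verify the check digit.

```lua
-- Hypothetical helper: returns the ISSN as NNNN-NNNC, or nil when the shape is wrong.
local function format_issn (issn)
	local compact = issn:gsub ('[%s%-]', '');		-- assumption: strip spaces and ASCII hyphens only
	if not compact:match ('^%d%d%d%d%d%d%d[%dXx]$') then
		return nil;									-- must be 8 characters: 7 digits then a digit or X
	end
	return compact:sub (1, 4) .. '-' .. compact:sub (5);	-- insert the hyphen at the midpoint
end
```

With this, `format_issn ('0819 4327')` returns `0819-4327`, which can be placed in an external link without the embedded space.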
Line 953: Line 957:  
  Format LCCN link and do simple error checking.  LCCN is a character string 8-12 characters long. The length of
  the LCCN dictates the character type of the first 1-3 characters; the rightmost eight are always digits.
− http://info-uri.info/registry/OAIHandler?verb=GetRecord&metadataPrefix=reg&identifier=info:lccn/
+ https://oclc-research.github.io/infoURI-Frozen/info-uri.info/info:lccn/reg.html
  length = 8 then all digits
