Transform Easting/Northing coordinates to Lat/Long
I've updated the `Gias::CsvTransformer` to convert the Easting/Northing
values provided by GIAS into Latitude/Longitude coordinates that can be
used for location-based search.

Previously we were using a geocoding service to get the
Latitude/Longitude coordinates for schools based on their address.
However, this requires a separate external API call for each school,
which is both slow and costly.

Turns out it's much more efficient to just make use of the geolocation
data GIAS already provides. It's a bit fiddly to perform the conversion
because it requires an external library called [PROJ][1]. However, once
that's installed it's pretty straightforward.

I would have liked to have used the gem [proj4rb][2] which is a Ruby
binding for PROJ. However, for [unexplained reasons][3], I couldn't
get that gem to work correctly. So instead I'm calling out to a command
line app that PROJ provides called [cs2cs][4], which converts coordinates
from one 'coordinate reference system' to another.

In my testing, this performs surprisingly well. Even though we need to
call out to an external process, it will easily convert the coordinates
of 20,000 schools in ~200ms. Not too shabby!

[1]: https://proj.org
[2]: https://github.com/cfis/proj4rb
[3]: cfis/proj4rb#23
[4]: https://proj.org/en/9.4/apps/cs2cs.html
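
The long-lived pipe approach described in the message can be sketched without PROJ installed. This is an illustration only: `cat` stands in for `cs2cs` (an assumption, made so the snippet runs on any POSIX system), and `LineProcess` is a hypothetical name rather than a class from this commit.

```ruby
# Sketch: keep one child process open and exchange one line per record,
# instead of spawning a new process for every school.
# `cat` is a stand-in for `cs2cs` (assumption), so it simply echoes its input.
class LineProcess
  def initialize(command)
    # "r+" attaches the child's stdin and stdout to a single IO object
    @io = IO.popen(command, "r+")
  end

  # Send one line to the child and read one line back
  def exchange(line)
    @io.write("#{line}\n")
    @io.flush
    @io.gets&.chomp
  end

  def close
    @io.close
  end
end

process = LineProcess.new(["cat"])
replies = ["533498 181201", "284509 77456"].map { |pair| process.exchange(pair) }
process.close
```

With the real `cs2cs` command in place of `cat`, each reply would be a line of transformed coordinates. The process is started once and reused for every row, which is what keeps the per-school cost low.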
ollietreend committed Apr 11, 2024
1 parent 9c7dcf9 commit c4fc0e9
Showing 7 changed files with 109 additions and 16 deletions.
2 changes: 2 additions & 0 deletions app/services/gias/csv_importer.rb
@@ -52,6 +52,8 @@ def call
     send_provision: school["TypeOfResourcedProvision (name)"].presence,
     rating: school["OfstedRating (name)"].presence,
     last_inspection_date: school["OfstedLastInsp"].presence,
+    latitude: school["Latitude"].presence,
+    longitude: school["Longitude"].presence,
   }

   if school["TrustSchoolFlag (code)"] == SUPPORTED_BY_A_TRUST
55 changes: 53 additions & 2 deletions app/services/gias/csv_transformer.rb
@@ -11,11 +11,16 @@ def call
       output_csv = CSV.new(output_file)

       CSV.new(input_file, headers: true, return_headers: true).each do |row|
-        if row.header_row? || school_in_scope?(row)
-          output_csv << row
+        if row.header_row?
+          output_csv << header_row(row)
+        elsif school_in_scope?(row)
+          output_csv << school_row(row)
         end
       end

+      # Close the IO handle to the cs2cs process
+      coordinate_transformer.close
+
       # Rewind the file so it's ready for reading
       output_file.rewind
       output_file
@@ -25,6 +30,23 @@ def call

     attr_reader :input_file, :output_file

+    def header_row(row)
+      row.headers + %w[Latitude Longitude]
+    end
+
+    def school_row(row)
+      coordinates = coordinate_transformer.transform(
+        easting: row.fetch("Easting"),
+        northing: row.fetch("Northing"),
+      )
+
+      row.fields + [coordinates[:latitude], coordinates[:longitude]]
+    end
+
+    def coordinate_transformer
+      @coordinate_transformer ||= CoordinateTransformer.new
+    end
+
     # The 'School Placements' and 'Funding Mentors' services only need to know
     # about schools in England. Closed schools and those outside of England can
     # be filtered out from the CSV.
@@ -51,5 +73,34 @@ def in_england?
         !NON_ENGLISH_ESTABLISHMENTS.include? row.fetch("TypeOfEstablishment (code)")
       end
     end
+
+    class CoordinateTransformer
+      # GIAS provides coordinates in British National Grid Easting/Northing format.
+      # https://epsg.io/27700
+      SOURCE_CRS = "EPSG:27700".freeze
+
+      # We need coordinates in WGS84 format, the standard Latitude/Longitude
+      # coordinate system used in GPS and online mapping tools.
+      # https://epsg.io/4326
+      TARGET_CRS = "EPSG:4326".freeze
+
+      attr_reader :cs2cs
+
+      delegate :close, to: :cs2cs
+
+      def initialize
+        # cs2cs is provided as part of the PROJ library
+        # Mac: brew install proj
+        # Linux (Debian): apt-get install proj-bin
+        # Usage: https://proj.org/en/9.4/apps/cs2cs.html
+        @cs2cs = IO.popen(["cs2cs", "-d", "10", SOURCE_CRS, TARGET_CRS], "r+")
+      end
+
+      def transform(easting:, northing:)
+        cs2cs.write "#{easting} #{northing}\n"
+        output = cs2cs.gets.split(" ")
+        { latitude: output[0], longitude: output[1] }
+      end
+    end
   end
 end
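
`CoordinateTransformer#transform` above assumes every call yields a well-formed line. A more defensive parse might look like the sketch below. `parse_cs2cs_line` is a hypothetical helper, not part of this commit. It assumes the default `cs2cs` error marker (a `*` in place of each coordinate) and that the first two whitespace-separated fields are latitude and longitude, which matches how the commit reads the output for the `EPSG:4326` target. Unlike the committed code, it returns Floats rather than Strings.

```ruby
# Hypothetical helper: turn one line of cs2cs output into a coordinate hash.
# Returns nil when there was no output or the transformation failed.
def parse_cs2cs_line(line)
  return nil if line.nil? # the pipe was closed or the process exited

  # Float(..., exception: false) yields nil for anything non-numeric
  latitude, longitude = line.split.first(2).map { |field| Float(field, exception: false) }
  return nil if latitude.nil? || longitude.nil? # e.g. the "*" error marker

  { latitude: latitude, longitude: longitude }
end

good = parse_cs2cs_line("51.5139702631\t-0.0775045667 0.0000000000\n")
bad = parse_cs2cs_line("*\t*\n")
```

Returning `nil` for a failed transformation would leave the Latitude/Longitude columns empty, which the importer already treats as "not geocoded" via `.presence`.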
8 changes: 4 additions & 4 deletions spec/fixtures/gias/gias_subset_transformed.csv
@@ -1,4 +1,4 @@
URN,LA (code),LA (name),EstablishmentNumber,EstablishmentName,TypeOfEstablishment (code),TypeOfEstablishment (name),EstablishmentTypeGroup (code),EstablishmentTypeGroup (name),EstablishmentStatus (code),EstablishmentStatus (name),ReasonEstablishmentOpened (code),ReasonEstablishmentOpened (name),OpenDate,ReasonEstablishmentClosed (code),ReasonEstablishmentClosed (name),CloseDate,PhaseOfEducation (code),PhaseOfEducation (name),StatutoryLowAge,StatutoryHighAge,Boarders (code),Boarders (name),NurseryProvision (name),OfficialSixthForm (code),OfficialSixthForm (name),Gender (code),Gender (name),ReligiousCharacter (code),ReligiousCharacter (name),ReligiousEthos (name),Diocese (code),Diocese (name),AdmissionsPolicy (code),AdmissionsPolicy (name),SchoolCapacity,SpecialClasses (code),SpecialClasses (name),CensusDate,NumberOfPupils,NumberOfBoys,NumberOfGirls,PercentageFSM,TrustSchoolFlag (code),TrustSchoolFlag (name),Trusts (code),Trusts (name),SchoolSponsorFlag (name),SchoolSponsors (name),FederationFlag (name),Federations (code),Federations (name),UKPRN,FEHEIdentifier,FurtherEducationType (name),OfstedLastInsp,OfstedSpecialMeasures (code),OfstedSpecialMeasures (name),LastChangedDate,Street,Locality,Address3,Town,County (name),Postcode,SchoolWebsite,TelephoneNum,HeadTitle (name),HeadFirstName,HeadLastName,HeadPreferredJobTitle,BSOInspectorateName (name),InspectorateReport,DateOfLastInspectionVisit,NextInspectionVisit,TeenMoth (name),TeenMothPlaces,CCF (name),SENPRU (name),EBD (name),PlacesPRU,FTProv (name),EdByOther (name),Section41Approved (name),SEN1 (name),SEN2 (name),SEN3 (name),SEN4 (name),SEN5 (name),SEN6 (name),SEN7 (name),SEN8 (name),SEN9 (name),SEN10 (name),SEN11 (name),SEN12 (name),SEN13 (name),TypeOfResourcedProvision (name),ResourcedProvisionOnRoll,ResourcedProvisionCapacity,SenUnitOnRoll,SenUnitCapacity,GOR (code),GOR (name),DistrictAdministrative (code),DistrictAdministrative (name),AdministrativeWard (code),AdministrativeWard (name),ParliamentaryConstituency 
(code),ParliamentaryConstituency (name),UrbanRural (code),UrbanRural (name),GSSLACode (name),Easting,Northing,MSOA (name),LSOA (name),InspectorateName (name),SENStat,SENNoStat,BoardingEstablishment (name),PropsName,PreviousLA (code),PreviousLA (name),PreviousEstablishmentNumber,OfstedRating (name),RSCRegion (name),Country (name),UPRN,SiteName,QABName (code),QABName (name),EstablishmentAccredited (code),EstablishmentAccredited (name),QABReport,CHNumber,MSOA (code),LSOA (code),FSM,AccreditationExpiryDate
100000,201,City of London,3614,The Aldgate School,2,Voluntary aided school,4,Local authority maintained schools,1,Open,0,Not applicable,,0,Not applicable,,2,Primary,3,11,1,No boarders,Has Nursery Classes,2,Does not have a sixth form,3,Mixed,2,Church of England,Does not apply,CE23,Diocese of London,0,Not applicable,271,2,No Special Classes,19-01-2023,271,144,127,18.10,0,Not applicable,,,Not applicable,,Not under a federation,,,10079319,,Not applicable,19-04-2013,0,Not applicable,09-02-2024,St James's Passage,Duke's Place,,London,,EC3A 5DE,www.thealdgateschool.org,2072831147,Miss,Alexandra,Allan,Headteacher,Not applicable,,,,Not applicable,,Not applicable,Not applicable,Not applicable,,,Not applicable,Not applicable,,,,,,,,,,,,,,,,,,,H,London,E09000001,City of London,E05009308,Portsoken,E14000639,Cities of London and Westminster,A1,(England/Wales) Urban major conurbation,E09000001,533498,181201,City of London 001,City of London 001F,,,,,,999,,,Outstanding,North-West London and South-Central England,,200000071925,,0,Not applicable,0,Not applicable,,,E02000001,E01032739,49,
137666,878,Devon,3106,Chudleigh Knighton Church of England Primary School,34,Academy converter,10,Academies,1,Open,10,Academy Converter,01-11-2011,99,,,2,Primary,5,11,1,No boarders,No Nursery Classes,0,Not applicable,3,Mixed,2,Church of England,Does not apply,CE15,Diocese of Exeter,0,Not applicable,105,2,No Special Classes,19-01-2023,100,51,49,19.00,3,Supported by a multi-academy trust,3104,THE FIRST FEDERATION TRUST,Linked to a sponsor,The First Federation Trust,Not applicable,,,10035224,,Not applicable,09-02-2023,0,Not applicable,13-09-2023,Chudleigh Knighton,,,Newton Abbot,Devon,TQ13 0EU,http://www.chudleigh-knighton.devon.sch.uk,1626852314,Mr,Simon,Westwood,Head of Teaching & Learning,Not applicable,,,,Not applicable,,Not applicable,Not applicable,Not applicable,,,Not applicable,Not applicable,,,,,,,,,,,,,,,,,,,K,South West,E07000045,Teignbridge,E05011899,Chudleigh,E14000623,Central Devon,D1,(England/Wales) Rural town and fringe,E10000008,284509,77456,Teignbridge 004,Teignbridge 004F,,,,,,911,Pre LGR (1998) Devon,3106,Good,South-West England,,10032960701,,0,Not applicable,0,Not applicable,,,E02004204,E01020223,19,
124087,860,Staffordshire,2216,Thomas Barnes Primary School,5,Foundation school,4,Local authority maintained schools,1,Open,0,Not applicable,,0,Not applicable,,2,Primary,4,11,1,No boarders,No Nursery Classes,2,Does not have a sixth form,3,Mixed,0,Does not apply,Does not apply,0,Not applicable,0,Not applicable,105,2,No Special Classes,19-01-2023,106,53,53,9.40,1,Supported by a trust,1579,Tame Valley Co-Operative Learning Trust,Not applicable,,Not under a federation,,,10073083,,Not applicable,28-03-2014,0,Not applicable,19-01-2024,School Lane,Hopwas,,Tamworth,Staffordshire,B78 3AD,http://www.thomasbarnes.staffs.sch.uk,1827213840,Mrs,E,Tibbitts,Headteacher,Not applicable,,,,Not applicable,,Not applicable,Not applicable,Not applicable,,,Not applicable,Not applicable,,,,,,,,,,,,,,,,,,,F,West Midlands,E07000194,Lichfield,E05010672,Whittington & Streethay,E14000986,Tamworth,F1,(England/Wales) Rural hamlet and isolated dwellings,E10000028,417927,305209,Lichfield 008,Lichfield 008B,,,,,,934,Pre LGR (1997) Staffordshire,,Outstanding,West Midlands,,100031712411,,0,Not applicable,0,Not applicable,,,E02006153,E01029517,10,
URN,LA (code),LA (name),EstablishmentNumber,EstablishmentName,TypeOfEstablishment (code),TypeOfEstablishment (name),EstablishmentTypeGroup (code),EstablishmentTypeGroup (name),EstablishmentStatus (code),EstablishmentStatus (name),ReasonEstablishmentOpened (code),ReasonEstablishmentOpened (name),OpenDate,ReasonEstablishmentClosed (code),ReasonEstablishmentClosed (name),CloseDate,PhaseOfEducation (code),PhaseOfEducation (name),StatutoryLowAge,StatutoryHighAge,Boarders (code),Boarders (name),NurseryProvision (name),OfficialSixthForm (code),OfficialSixthForm (name),Gender (code),Gender (name),ReligiousCharacter (code),ReligiousCharacter (name),ReligiousEthos (name),Diocese (code),Diocese (name),AdmissionsPolicy (code),AdmissionsPolicy (name),SchoolCapacity,SpecialClasses (code),SpecialClasses (name),CensusDate,NumberOfPupils,NumberOfBoys,NumberOfGirls,PercentageFSM,TrustSchoolFlag (code),TrustSchoolFlag (name),Trusts (code),Trusts (name),SchoolSponsorFlag (name),SchoolSponsors (name),FederationFlag (name),Federations (code),Federations (name),UKPRN,FEHEIdentifier,FurtherEducationType (name),OfstedLastInsp,OfstedSpecialMeasures (code),OfstedSpecialMeasures (name),LastChangedDate,Street,Locality,Address3,Town,County (name),Postcode,SchoolWebsite,TelephoneNum,HeadTitle (name),HeadFirstName,HeadLastName,HeadPreferredJobTitle,BSOInspectorateName (name),InspectorateReport,DateOfLastInspectionVisit,NextInspectionVisit,TeenMoth (name),TeenMothPlaces,CCF (name),SENPRU (name),EBD (name),PlacesPRU,FTProv (name),EdByOther (name),Section41Approved (name),SEN1 (name),SEN2 (name),SEN3 (name),SEN4 (name),SEN5 (name),SEN6 (name),SEN7 (name),SEN8 (name),SEN9 (name),SEN10 (name),SEN11 (name),SEN12 (name),SEN13 (name),TypeOfResourcedProvision (name),ResourcedProvisionOnRoll,ResourcedProvisionCapacity,SenUnitOnRoll,SenUnitCapacity,GOR (code),GOR (name),DistrictAdministrative (code),DistrictAdministrative (name),AdministrativeWard (code),AdministrativeWard (name),ParliamentaryConstituency 
(code),ParliamentaryConstituency (name),UrbanRural (code),UrbanRural (name),GSSLACode (name),Easting,Northing,MSOA (name),LSOA (name),InspectorateName (name),SENStat,SENNoStat,BoardingEstablishment (name),PropsName,PreviousLA (code),PreviousLA (name),PreviousEstablishmentNumber,OfstedRating (name),RSCRegion (name),Country (name),UPRN,SiteName,QABName (code),QABName (name),EstablishmentAccredited (code),EstablishmentAccredited (name),QABReport,CHNumber,MSOA (code),LSOA (code),FSM,AccreditationExpiryDate,Latitude,Longitude
100000,201,City of London,3614,The Aldgate School,2,Voluntary aided school,4,Local authority maintained schools,1,Open,0,Not applicable,,0,Not applicable,,2,Primary,3,11,1,No boarders,Has Nursery Classes,2,Does not have a sixth form,3,Mixed,2,Church of England,Does not apply,CE23,Diocese of London,0,Not applicable,271,2,No Special Classes,19-01-2023,271,144,127,18.10,0,Not applicable,,,Not applicable,,Not under a federation,,,10079319,,Not applicable,19-04-2013,0,Not applicable,09-02-2024,St James's Passage,Duke's Place,,London,,EC3A 5DE,www.thealdgateschool.org,2072831147,Miss,Alexandra,Allan,Headteacher,Not applicable,,,,Not applicable,,Not applicable,Not applicable,Not applicable,,,Not applicable,Not applicable,,,,,,,,,,,,,,,,,,,H,London,E09000001,City of London,E05009308,Portsoken,E14000639,Cities of London and Westminster,A1,(England/Wales) Urban major conurbation,E09000001,533498,181201,City of London 001,City of London 001F,,,,,,999,,,Outstanding,North-West London and South-Central England,,200000071925,,0,Not applicable,0,Not applicable,,,E02000001,E01032739,49,,51.5139702631,-0.0775045667
137666,878,Devon,3106,Chudleigh Knighton Church of England Primary School,34,Academy converter,10,Academies,1,Open,10,Academy Converter,01-11-2011,99,,,2,Primary,5,11,1,No boarders,No Nursery Classes,0,Not applicable,3,Mixed,2,Church of England,Does not apply,CE15,Diocese of Exeter,0,Not applicable,105,2,No Special Classes,19-01-2023,100,51,49,19.00,3,Supported by a multi-academy trust,3104,THE FIRST FEDERATION TRUST,Linked to a sponsor,The First Federation Trust,Not applicable,,,10035224,,Not applicable,09-02-2023,0,Not applicable,13-09-2023,Chudleigh Knighton,,,Newton Abbot,Devon,TQ13 0EU,http://www.chudleigh-knighton.devon.sch.uk,1626852314,Mr,Simon,Westwood,Head of Teaching & Learning,Not applicable,,,,Not applicable,,Not applicable,Not applicable,Not applicable,,,Not applicable,Not applicable,,,,,,,,,,,,,,,,,,,K,South West,E07000045,Teignbridge,E05011899,Chudleigh,E14000623,Central Devon,D1,(England/Wales) Rural town and fringe,E10000008,284509,77456,Teignbridge 004,Teignbridge 004F,,,,,,911,Pre LGR (1998) Devon,3106,Good,South-West England,,10032960701,,0,Not applicable,0,Not applicable,,,E02004204,E01020223,19,,50.5853706802,-3.6327567586
124087,860,Staffordshire,2216,Thomas Barnes Primary School,5,Foundation school,4,Local authority maintained schools,1,Open,0,Not applicable,,0,Not applicable,,2,Primary,4,11,1,No boarders,No Nursery Classes,2,Does not have a sixth form,3,Mixed,0,Does not apply,Does not apply,0,Not applicable,0,Not applicable,105,2,No Special Classes,19-01-2023,106,53,53,9.40,1,Supported by a trust,1579,Tame Valley Co-Operative Learning Trust,Not applicable,,Not under a federation,,,10073083,,Not applicable,28-03-2014,0,Not applicable,19-01-2024,School Lane,Hopwas,,Tamworth,Staffordshire,B78 3AD,http://www.thomasbarnes.staffs.sch.uk,1827213840,Mrs,E,Tibbitts,Headteacher,Not applicable,,,,Not applicable,,Not applicable,Not applicable,Not applicable,,,Not applicable,Not applicable,,,,,,,,,,,,,,,,,,,F,West Midlands,E07000194,Lichfield,E05010672,Whittington & Streethay,E14000986,Tamworth,F1,(England/Wales) Rural hamlet and isolated dwellings,E10000028,417927,305209,Lichfield 008,Lichfield 008B,,,,,,934,Pre LGR (1997) Staffordshire,,Outstanding,West Midlands,,100031712411,,0,Not applicable,0,Not applicable,,,E02006153,E01029517,10,,52.6443444763,-1.7364805658
12 changes: 6 additions & 6 deletions spec/fixtures/gias/import_with_trusts_and_regions.csv
@@ -1,6 +1,6 @@
-URN,EstablishmentName,Town,Postcode,EstablishmentStatus (code),TypeOfEstablishment (code),DistrictAdministrative (code),TrustSchoolFlag (code),Trusts (code),Trusts (name)
-130,InnerLondonSchool,InnerLondonTown,Postcode,1,7,E09000007,,
-131,OuterLondonSchool,OuterLondonTown,Postcode,1,7,E09000004,,
-123,FringeSchool,FringeTown,Postcode,1,7,E06000039,,
-132,RestOfEnglandSchool,RestOfEnglandTown,Postcode,1,7,E09000099,,
-140,TrustSchool,TrustTown,Postcode,1,7,E09000007,1,12345,Department for Education Trust
+URN,EstablishmentName,Town,Postcode,EstablishmentStatus (code),TypeOfEstablishment (code),DistrictAdministrative (code),TrustSchoolFlag (code),Trusts (code),Trusts (name),Latitude,Longitude
+130,InnerLondonSchool,InnerLondonTown,Postcode,1,7,E09000007,,,,51.5139702631,-0.0775045667
+131,OuterLondonSchool,OuterLondonTown,Postcode,1,7,E09000004,,,,,
+123,FringeSchool,FringeTown,Postcode,1,7,E06000039,,,,,
+132,RestOfEnglandSchool,RestOfEnglandTown,Postcode,1,7,E09000099,,,,,
+140,TrustSchool,TrustTown,Postcode,1,7,E09000007,1,12345,Department for Education Trust,,
8 changes: 4 additions & 4 deletions spec/jobs/gias/sync_all_schools_job_spec.rb
@@ -31,10 +31,10 @@
     it "downloads and imports school data from GIAS" do
       expect { described_class.perform_now }.to change(School, :count).from(0).to(3)

-      expect(School.pluck(:urn, :name)).to eq [
-        ["100000", "The Aldgate School"],
-        ["137666", "Chudleigh Knighton Church of England Primary School"],
-        ["124087", "Thomas Barnes Primary School"],
+      expect(School.pluck(:urn, :name, :latitude, :longitude)).to eq [
+        ["100000", "The Aldgate School", 51.5139702631, -0.0775045667],
+        ["137666", "Chudleigh Knighton Church of England Primary School", 50.5853706802, -3.6327567586],
+        ["124087", "Thomas Barnes Primary School", 52.6443444763, -1.7364805658],
       ]
     end
   end
26 changes: 26 additions & 0 deletions spec/services/gias/csv_importer_spec.rb
@@ -79,4 +79,30 @@
       end
     end
   end
+
+  describe "geocoding schools" do
+    subject(:school) { School.find_by!(urn:) }
+
+    before { gias_importer }
+
+    context "when the CSV has a Latitude/Longitude for the school" do
+      let(:urn) { "130" }
+
+      it "geocodes the school" do
+        expect(school).to be_geocoded
+        expect(school.latitude).to eq(51.5139702631)
+        expect(school.longitude).to eq(-0.0775045667)
+      end
+    end
+
+    context "when the CSV doesn't provide a Latitude/Longitude for the school" do
+      let(:urn) { "131" }
+
+      it "imports the school but does not geocode it" do
+        expect(school).not_to be_geocoded
+        expect(school.latitude).to be_nil
+        expect(school.longitude).to be_nil
+      end
+    end
+  end
 end
14 changes: 14 additions & 0 deletions spec/services/gias/csv_transformer_spec.rb
@@ -16,6 +16,20 @@
     expect(actual_output).to eq(expected_output)
   end

+  it "converts Easting/Northing to Latitude/Longitude" do
+    output_csv = CSV.read(output_file, headers: true)
+
+    easting_northing = output_csv.values_at("Easting", "Northing")
+    latitude_longitude = output_csv.values_at("Latitude", "Longitude")
+
+    expect(easting_northing.zip(latitude_longitude).to_h).to eq({
+      # Easting, Northing => Latitude, Longitude
+      %w[533498 181201] => %w[51.5139702631 -0.0775045667],
+      %w[284509 77456] => %w[50.5853706802 -3.6327567586],
+      %w[417927 305209] => %w[52.6443444763 -1.7364805658],
+    })
+  end
+
   it "filters out schools which are Closed or not in England" do
     output_csv = CSV.read(output_file, headers: true)
     urns = output_csv.values_at("URN").flatten
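
The mapping built in the new spec relies on `CSV::Table#values_at` returning one array per row when it is given header names. A small standalone sketch of that behaviour, using an inline two-row CSV (values taken from the fixture above, with all other columns dropped for brevity) rather than the real fixture file:

```ruby
require "csv"

# Inline stand-in for the transformed fixture (assumption: only the
# columns relevant to the coordinate check are included)
csv_text = <<~CSV
  URN,Easting,Northing,Latitude,Longitude
  100000,533498,181201,51.5139702631,-0.0775045667
  137666,284509,77456,50.5853706802,-3.6327567586
CSV

table = CSV.parse(csv_text, headers: true)

# Each call returns an array of [value, value] pairs, one pair per row
easting_northing = table.values_at("Easting", "Northing")
latitude_longitude = table.values_at("Latitude", "Longitude")

# Pair the two lists row by row and turn them into a lookup hash
mapping = easting_northing.zip(latitude_longitude).to_h
```

Because the CSV is parsed without converters, every value is a String, which is why the spec compares against `%w[...]` arrays rather than numbers.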
