up demo
parent 153e9bd8f9
commit 6d755ee8dc
8 changed files with 1058 additions and 17 deletions
@@ -1,130 +0,0 @@
# SNCF Travaux Extractor Implementation

## Overview

This document describes the implementation of the SNCF Travaux Extractor for the OpenEventDatabase. The extractor fetches railway work schedules from the SNCF open data API and adds them to the database as events.

## Implementation Details

The extractor is implemented in `extractors/sncf_travaux.py`. It consists of the following components:

1. **API Integration**: The extractor connects to the SNCF open data API to fetch railway work schedules.
2. **Date Conversion**: The extractor converts week numbers to dates, as the SNCF data provides the year and week number rather than explicit start and end dates.
3. **Event Creation**: The extractor creates event objects from the SNCF data, including all required properties for the OpenEventDatabase.
4. **Database Integration**: The extractor submits events to the OpenEventDatabase.

### Key Functions

#### `fetch_sncf_data()`

This function fetches railway work planning data from the SNCF open data API. It handles HTTP errors and JSON decoding errors, and checks that the response contains a 'results' field.

```python
def fetch_sncf_data():
    """
    Fetch railway work planning data from the SNCF open data API.

    Returns:
        list: A list of railway work records.
    """
    # Implementation details...
```
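
As a rough illustration of the response checks described above, the sketch below parses a response body and returns the records found under `results`. The helper name `parse_sncf_response` and the empty-list fallback are assumptions made for this example, not the extractor's actual code, and the HTTP request itself is left out.

```python
import json


def parse_sncf_response(body):
    """Return the list under 'results', or [] if the body is unusable.

    Hypothetical helper: the name and the empty-list fallback are assumptions.
    """
    try:
        data = json.loads(body)
    except (json.JSONDecodeError, TypeError):
        # The body is not valid JSON, or is not a string/bytes value at all.
        return []
    if not isinstance(data, dict) or 'results' not in data:
        # Valid JSON, but not the expected {"results": [...]} shape.
        return []
    results = data['results']
    return results if isinstance(results, list) else []
```

Returning an empty list keeps the caller's loop over records simple; raising an exception would be an equally reasonable choice.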

#### `week_to_date(year, week_number)`

This function converts a year and week number to a date range (start date and end date). It accepts either strings or integers for both arguments and handles edge cases such as invalid week numbers.

```python
def week_to_date(year, week_number):
    """
    Convert a year and week number to a date range.

    Args:
        year (str or int): The year.
        week_number (str or int): The week number (1-53).

    Returns:
        tuple: A tuple containing (start_date, end_date) as ISO format strings.
    """
    # Implementation details...
```
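
One way such a conversion could work is sketched below. It assumes the SNCF week numbers follow ISO 8601 (weeks run Monday to Sunday), which this document does not confirm. The name `iso_week_range` and the `None` return on bad input are illustrative choices, not the extractor's actual behavior.

```python
from datetime import date, timedelta


def iso_week_range(year, week_number):
    """Return (start_date, end_date) ISO strings for an ISO week, or None.

    Assumes ISO 8601 weeks (Monday to Sunday); the SNCF data may differ.
    """
    try:
        # date.fromisocalendar requires Python 3.8 or later.
        monday = date.fromisocalendar(int(year), int(week_number), 1)
    except (TypeError, ValueError):
        # Non-numeric input, or a week that does not exist in that year.
        return None
    sunday = monday + timedelta(days=6)
    return monday.isoformat(), sunday.isoformat()
```

For example, `iso_week_range(2024, 1)` returns `('2024-01-01', '2024-01-07')`, and `iso_week_range(2023, 53)` returns `None` because 2023 has only 52 ISO weeks.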

#### `create_event(record)`

This function creates an event object from an SNCF record. It extracts the relevant data from the record, converts the year and week number to start and end dates, and creates a GeoJSON Feature object with all the necessary properties.

```python
def create_event(record):
    """
    Create an event object from an SNCF record.

    Args:
        record: A record from the SNCF API.

    Returns:
        dict: A GeoJSON Feature representing the event.
    """
    # Implementation details...
```

#### `submit_event(event)`

This function submits an event to the OpenEventDatabase. It connects to the database, inserts the geometry and event data, and returns `False` if a step fails, for example when the geometry is invalid.

```python
def submit_event(event):
    """
    Submit an event to the OpenEventDatabase.

    Args:
        event: A GeoJSON Feature representing the event.

    Returns:
        bool: True if the event was successfully submitted, False otherwise.
    """
    # Implementation details...
```

### Event Properties

The events created by the extractor include the following properties:

- `type`: "scheduled" (as these are planned railway works)
- `what`: "transport.railway.maintenance" (a descriptive category)
- `what:series`: "SNCF Railway Maintenance" (to group related events)
- `where`: The line name
- `label`: A descriptive label
- `description`: A detailed description of the work
- `start` and `stop`: The start and end dates derived from the year and week number

Additional properties specific to railway works:
- `line_code`: The line code
- `work_type`: The type of work
- `interventions`: The number of interventions
- `start_point` and `end_point`: The start and end points (kilometer points)
- `structure`: The managing structure
- `source`: "SNCF Open Data"
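
To make the shape of such an event concrete, here is a minimal sketch of a GeoJSON Feature carrying a subset of the properties listed above. The function name `build_feature`, the record keys (`line_name`, `line_code`, `work_type`), the label format, and the placeholder coordinates are all assumptions for illustration; the real SNCF field names and the extractor's own code may differ.

```python
# Approximate centre of France as [longitude, latitude]; an assumed placeholder.
PLACEHOLDER_COORDINATES = [2.2137, 46.2276]


def build_feature(record, start, stop):
    """Build a GeoJSON Feature from a record with hypothetical keys."""
    line_name = record.get('line_name', '')
    work_type = record.get('work_type', '')
    return {
        'type': 'Feature',
        'geometry': {
            'type': 'Point',
            # Copy the list so callers cannot mutate the shared constant.
            'coordinates': list(PLACEHOLDER_COORDINATES),
        },
        'properties': {
            'type': 'scheduled',
            'what': 'transport.railway.maintenance',
            'what:series': 'SNCF Railway Maintenance',
            'where': line_name,
            'label': f'{work_type} on {line_name}',  # assumed label format
            'start': start,
            'stop': stop,
            'line_code': record.get('line_code'),
            'work_type': work_type,
            'source': 'SNCF Open Data',
        },
    }
```

GeoJSON orders coordinates as longitude first, then latitude, which is why the placeholder is written as `[2.2137, 46.2276]`.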

## Testing

A test script (`test_sncf_travaux.py`) exercises the extractor without submitting events to the database. It tests the `week_to_date()`, `create_event()`, and `fetch_sncf_data()` functions with various inputs.

To run the test script:

```bash
python3 extractors/test_sncf_travaux.py
```

## Usage

To run the extractor and add SNCF railway work events to the database:

```bash
./extractors/sncf_travaux.py
```

## Notes

- The extractor uses a placeholder location (center of France) for the event geometry. A production version could geocode the line or use a predefined location instead.
- The extractor assumes that the SNCF API returns data in the format described in the code comments. If the API changes, the extractor may need to be updated.
- The extractor handles error cases such as missing required fields, invalid week numbers, and database connection errors.
@@ -284,6 +284,7 @@ def event_exists(db, properties):
     Returns:
         bool: True if the event exists, False otherwise.
     """
+    print('event: ', properties)
     try:
         cur = db.cursor()

@@ -348,6 +349,7 @@ def submit_event(event):
         cur = db.cursor()
         geometry = json.dumps(event['geometry'])

+        print('event: ', event)
         # Insert the geometry into the geo table
         cur.execute("""
             INSERT INTO geo
@@ -361,20 +363,56 @@ def submit_event(event):
         hash_result = cur.fetchone()

         if hash_result is None:
-            # If the hash is None, get it from the database
+            # If the hash is None, check if the geometry already exists in the database
             cur.execute("""
-                SELECT md5(st_asewkt(geom)),
-                    ST_IsValid(geom),
-                    ST_IsValidReason(geom) from (SELECT st_geomfromgeojson(%s) as geom) as g;
+                SELECT hash FROM geo
+                WHERE hash = md5(st_astext(st_setsrid(st_geomfromgeojson(%s),4326)));
             """, (geometry,))
-            hash_result = cur.fetchone()
-
-            if hash_result is None or (len(hash_result) > 1 and not hash_result[1]):
-                logger.error(f"Invalid geometry for event: {properties.get('label')}")
-                db.close()
-                return False
-
-        geo_hash = hash_result[0]
+            existing_hash = cur.fetchone()
+
+            if existing_hash:
+                # Geometry already exists in the database, use its hash
+                geo_hash = existing_hash[0]
+                logger.info(f"Using existing geometry with hash: {geo_hash}")
+            else:
+                # Geometry doesn't exist, try to insert it directly
+                cur.execute("""
+                    SELECT md5(st_astext(geom)) as hash,
+                        ST_IsValid(geom),
+                        ST_IsValidReason(geom) from (SELECT st_setsrid(st_geomfromgeojson(%s),4326) as geom) as g;
+                """, (geometry,))
+                hash_result = cur.fetchone()
+
+                if hash_result is None or not hash_result[1]:
+                    logger.error(f"Invalid geometry for event: {properties.get('label')}")
+                    if hash_result and len(hash_result) > 2:
+                        logger.error(f"Reason: {hash_result[2]}")
+                    db.close()
+                    return False
+
+                geo_hash = hash_result[0]
+
+                # Now insert the geometry explicitly
+                cur.execute("""
+                    INSERT INTO geo (geom, hash, geom_center)
+                    VALUES (
+                        st_setsrid(st_geomfromgeojson(%s),4326),
+                        %s,
+                        st_centroid(st_setsrid(st_geomfromgeojson(%s),4326))
+                    )
+                    ON CONFLICT (hash) DO NOTHING;
+                """, (geometry, geo_hash, geometry))
+
+                # Verify the geometry was inserted
+                cur.execute("SELECT 1 FROM geo WHERE hash = %s", (geo_hash,))
+                if cur.fetchone() is None:
+                    logger.error(f"Failed to insert geometry with hash: {geo_hash}")
+                    db.close()
+                    return False
+
+                logger.info(f"Inserted new geometry with hash: {geo_hash}")
+        else:
+            geo_hash = hash_result[0]

         # Determine the bounds for the time range
         bounds = '[]' if properties['start'] == properties['stop'] else '[)'