Initial commit
12
.gitignore
vendored
Normal file
@@ -0,0 +1,12 @@
# --- OS / editor ---
.DS_Store
Thumbs.db
*.swp
*~
.idea/
.vscode/

# --- Archives ---
*.tar.gz
*.tgz
*.zip
103
README.md
Normal file
@@ -0,0 +1,103 @@
# thalim-hal-importer

WordPress plugin that imports the publications of the THALIM laboratory from the [HAL](https://hal.science/) open archive (structure `254015`) and turns them into WordPress posts with all Pods fields filled in.

- **Version:** 2.0.0
- **Author:** THALIM Dev
- **License:** GPL v2 or later

## Installation

```bash
cd wp-content/plugins
git clone gitea@figureslibres.io:valentin_le_moign/thalim-plugin-hal-importer.git thalim-hal-importer
```

Then activate the plugin from the WordPress admin. Within the THALIM project, cloning is automated by `bootstrap.sh` in the [`thalim-stack`](https://figureslibres.io/valentin_le_moign/thalim-stack) repo.

## Usage

Once activated, the plugin adds an admin page: **Tools → HAL Import** (required capability: `edit_others_posts`).

The page offers two import workflows:

### 1. Live preview + incremental import

- Filters: date range (`producedDate_s`) and author (idHAL of a THALIM member)
- Lists the matching HAL publications with a color-coded status:
  - **green**: already imported (a `hal_id` meta exists on the WordPress side)
  - **yellow**: ready to import (at least one HAL author matches a WordPress user)
  - **red**: no THALIM author identified, so the publication is skipped
- **Importer** button: creates all "ready" posts with status `pending` (to be published after review)
- Previews are cached in a transient (5 min, key hashed from the filters) and can be refreshed manually
- **Test API** button to check the connection
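
The preview cache key can be derived from the three filter values. The sketch below mirrors what `includes/class-admin-page.php` does (an `md5` of the filters joined with `|`, prefixed with `thalim_hal_preview_`). The function name `thalim_preview_cache_key` and the sample filter values are hypothetical and only serve the illustration.

```php
<?php
// Hypothetical helper: builds the transient key under which a preview is cached.
// Identical filters always give the same key, so they share one cache entry.
function thalim_preview_cache_key(string $date_from = '', string $date_to = '', string $author_hal_id = ''): string {
    return 'thalim_hal_preview_' . md5($date_from . '|' . $date_to . '|' . $author_hal_id);
}

// Made-up filter values.
$key_a = thalim_preview_cache_key('2024-01-01', '2024-12-31');
$key_b = thalim_preview_cache_key('2024-01-01', '2024-12-31', 'jane-doe');
// $key_a and $key_b differ: adding an author filter produces a separate cache entry.
```

Because every filter combination gets its own transient, the manual refresh has to clear all keys sharing the `thalim_hal_preview_` prefix rather than a single entry.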

### 2. Bulk import from CSV (legacy SPIP)

Imports older publications in batches of 100:

- Upload a CSV with a `hal_id` column, plus a SPIP context file in JSON (axes/tags/programs/owner per publication)
- Batches are processed sequentially (click once per batch)
- A CSV report can be downloaded at the end of the queue
- The import can be cancelled at any time
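
Sequential batch processing boils down to slicing the queue at the current offset. This is a minimal sketch, assuming a queue array with `hal_ids` and `done` keys like the one stored by `includes/class-admin-page.php`. The function `next_batch` and the sample IDs are hypothetical.

```php
<?php
// Hypothetical helper: returns the next slice of HAL IDs to process.
function next_batch(array $queue, int $batch_size = 100): array {
    return array_slice($queue['hal_ids'], $queue['done'], $batch_size);
}

// Sample queue with 250 made-up IDs.
$queue = ['hal_ids' => array_map(fn($i) => "hal-$i", range(1, 250)), 'done' => 0];
$sizes = [];
// Each loop iteration stands for one click on the batch button.
while ($batch = next_batch($queue)) {
    $sizes[] = count($batch);
    $queue['done'] += count($batch); // advance the offset once the batch is handled
}
// $sizes is [100, 100, 50]: three clicks are needed for 250 publications.
```

Keeping only an offset in the stored queue means an interrupted import can resume where it stopped.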

## HAL type → WordPress category mapping

`DOC_TYPE_MAP` in `includes/class-importer.php`:

| HAL type      | Description                 | WP category |
| ------------- | --------------------------- | ----------- |
| `ART`         | Article                     | `16`        |
| `COUV`        | Book chapter                | `16`        |
| `OUV`         | Book                        | `15`        |
| `COMM`        | Conference paper            | `13`        |
| `ISSUE`       | Journal issue (editor)      | `16`        |
| `PROCEEDINGS` | Edited volume / proceedings | `15`        |
| `THESE`       | Thesis                      | `14`        |
| `HDR`         | HDR                         | `14`        |
| `SON`         | Sound                       | `19`        |
| `VIDEO`       | Video                       | `19`        |

`COMM`, `THESE`, `HDR`, `SON`, and `VIDEO` are treated as events and use the Pods field `date_de_debut`. All other types use `datetime`.
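
The table and the event rule can be combined into one lookup. The sketch below copies the values from the table. The real `DOC_TYPE_MAP` lives in `includes/class-importer.php` and may be shaped differently. `EVENT_TYPES`, `resolve_doc_type` and the use of integer IDs are assumptions made for this example.

```php
<?php
// Illustrative copy of the table above (category IDs written as integers here).
const DOC_TYPE_MAP = [
    'ART' => 16, 'COUV' => 16, 'OUV' => 15, 'COMM' => 13, 'ISSUE' => 16,
    'PROCEEDINGS' => 15, 'THESE' => 14, 'HDR' => 14, 'SON' => 19, 'VIDEO' => 19,
];
// Types handled as events: they use date_de_debut instead of datetime.
const EVENT_TYPES = ['COMM', 'THESE', 'HDR', 'SON', 'VIDEO'];

// Hypothetical helper: resolves the category ID and date field for a HAL docType.
function resolve_doc_type(string $doc_type): ?array {
    $category = DOC_TYPE_MAP[$doc_type] ?? null;
    if ($category === null) {
        return null; // unknown type: the caller decides what to do
    }
    return [
        'category'   => $category,
        'date_field' => in_array($doc_type, EVENT_TYPES, true) ? 'date_de_debut' : 'datetime',
    ];
}

$article = resolve_doc_type('ART');   // category 16, date field 'datetime'
$thesis  = resolve_doc_type('THESE'); // category 14, date field 'date_de_debut'
```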

## HAL fields → WP fields

On import, each publication fills in:

- **Identification**: `hal_id`, `hal_url` (HAL URI), `lien_externe_1` (PDF when available) with the title `Document HAL // HAL Document`
- **Title**: `post_title` (taken from HAL)
- **Category**: written three times (term assignment + `categorie` meta + `_pods_categorie` meta)
- **HAL authors → members**: matched via `authIdHal_s` ↔ user meta `identifiant_hal`. Stored using the Pods triple-write pattern (`membres` via `add_post_meta` + `_pods_membres`)
- **Date**: `date_de_debut` or `datetime` depending on the type, optionally backdated to `producedDate_s`
- **Metadata**: `journal`, `editeur`, `sous-titre` (book/conference title), `lieu` (city/country for PROCEEDINGS), `reference_bibliographique` (`citationFull_s`)
- **Role**: `fonction_auteur` (varies by document type: « Auteur du chapitre // Chapter author », « Direction de numéro // Editor-in-Chief », etc.)
- **Thematic axes**: cascade `spip_context` → THALIM co-authors → owner. The source actually used is stored in `$importer->last_axes_source`
- **Research programs** and **tags**: from `spip_context` (CSV import only)
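
Author matching relies on comparing normalized identifiers. The sketch below applies the same normalization as the admin page (trim, then lowercase) to a made-up user map. The function names `normalize_hal_id` and `match_authors` and the sample data are hypothetical.

```php
<?php
// Hypothetical helper: trim + lowercase, as done in class-admin-page.php.
function normalize_hal_id(string $hal_id): string {
    return strtolower(trim($hal_id));
}

// Returns the display names of users whose idHAL appears among the HAL authors.
// $users_by_hal_id is keyed by normalized idHAL.
function match_authors(array $author_hal_ids, array $users_by_hal_id): array {
    $matched = [];
    foreach ($author_hal_ids as $hal_id) {
        $key = normalize_hal_id((string) $hal_id);
        if (isset($users_by_hal_id[$key])) {
            $matched[] = $users_by_hal_id[$key]['name'];
        }
    }
    return $matched;
}

// Made-up data: one WordPress user, two HAL authors.
$users = ['jane-doe' => ['id' => 12, 'name' => 'Jane Doe']];
$names = match_authors([' Jane-Doe ', 'john-smith'], $users);
// $names is ['Jane Doe']; an empty result would leave the publication blocked (red).
```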

## Deduplication

The importer checks the `hal_id` meta before each insertion, so a publication cannot be imported twice. The result of `is_imported($hal_id)` is also shown in the status column of the preview.

## Requirements

- WordPress 6.0+
- PHP 7.4+
- **Pods** plugin (the `post` pod and the `identifiant_hal` user field)
- WordPress category IDs matching the mapping (8/13/14/15/16/19), hard-coded in `DOC_TYPE_MAP`

## Structure

```
.
├── thalim-hal-importer.php      # entry point, constants, bootstrap
└── includes/
    ├── class-hal-api.php        # HAL API client (fetch_publications, fetch_by_hal_ids)
    ├── class-admin-page.php     # Tools > HAL Import UI (preview + CSV)
    └── class-importer.php       # HAL → WP post mapping (triple storage, axes cascade)
```

## HAL API

- Base URL: `https://api.archives-ouvertes.fr/search/`
- THALIM structure: `254015`
- Documentation: <https://api.archives-ouvertes.fr/docs>
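
A search URL for the structure can be assembled with `http_build_query`. This is only a sketch: the plugin's actual query is built in `includes/class-hal-api.php`, which is not reproduced here. The `structId_i` filter, the `fl` field list and the function name `hal_search_url` are assumptions, while the sort key `producedDate_tdate desc` is the one passed by the admin page.

```php
<?php
// Hypothetical helper: builds a HAL search URL for one structure.
function hal_search_url(int $struct_id, int $rows = 50, int $start = 0): string {
    $params = [
        'q'     => 'structId_i:' . $struct_id, // assumption: structure filter field
        'fl'    => 'halId_s,title_s,docType_s,authIdHal_s,producedDate_s,uri_s', // assumed field list
        'sort'  => 'producedDate_tdate desc',
        'rows'  => $rows,
        'start' => $start,
        'wt'    => 'json',
    ];
    // http_build_query() URL-encodes the values (":" becomes %3A, spaces become "+").
    return 'https://api.archives-ouvertes.fr/search/?' . http_build_query($params);
}

$url = hal_search_url(254015, 10);
```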
750
includes/class-admin-page.php
Normal file
@@ -0,0 +1,750 @@
<?php
/**
 * Admin Page Class - Handles the admin interface
 */

if (!defined('ABSPATH')) {
    exit;
}

class Thalim_HAL_Admin_Page {

    private $api;
    private $message = null;
    private $wp_users_by_hal_id = null; // Cache: normalized_hal_id => ['id' => int, 'name' => string, 'hal_id' => string]

    // Document type labels
    private const DOC_TYPE_LABELS = [
        'ART' => 'Article',
        'COUV' => "Chapitre d'ouvrage",
        'OUV' => 'Ouvrage',
        'COMM' => 'Communication',
        'ISSUE' => 'Direction de numéro',
        'PROCEEDINGS' => 'Colloque',
        'THESE' => 'Thèse',
        'HDR' => 'HDR',
        'SON' => 'Son',
        'VIDEO' => 'Vidéo',
    ];

    public function __construct() {
        $this->api = new Thalim_HAL_API();
    }

    public function render() {
        if (!current_user_can('edit_others_posts')) {
            wp_die('Unauthorized');
        }
        $this->handle_actions();
        echo '<div class="wrap"><h1>THALIM HAL Importer</h1>';
        $this->render_styles();
        $this->render_message();
        $this->render_config();
        $this->render_preview();
        $this->render_csv_import();
        echo '</div>';
    }

    private function handle_actions() {
        if (!isset($_POST['thalim_hal_action'])) return;
        $nonce = sanitize_text_field(wp_unslash($_POST['_wpnonce'] ?? ''));
        if (!wp_verify_nonce($nonce, 'thalim_hal_action')) {
            $this->message = ['error', 'Security check failed.'];
            return;
        }
        $action = sanitize_text_field(wp_unslash($_POST['thalim_hal_action']));

        if ($action === 'test_api') {
            $result = $this->api->test_connection();
            $this->message = is_wp_error($result)
                ? ['error', 'API Error: ' . $result->get_error_message()]
                : ['success', "Connection OK! Found {$result['total']} publications."];
        }

        if ($action === 'refresh') {
            // Clear all preview transients (keyed by a hash of the date range and author filters).
            // esc_like() keeps the underscores in the prefix from acting as LIKE wildcards.
            global $wpdb;
            foreach (['_transient_thalim_hal_preview_', '_transient_timeout_thalim_hal_preview_'] as $prefix) {
                $wpdb->query($wpdb->prepare(
                    "DELETE FROM {$wpdb->options} WHERE option_name LIKE %s",
                    $wpdb->esc_like($prefix) . '%'
                ));
            }
            $this->message = ['success', 'Preview data refreshed from HAL API.'];
        }

        if ($action === 'import_pending') {
            $this->handle_import();
        }

        if ($action === 'csv_upload') $this->handle_csv_upload();
        if ($action === 'csv_batch') $this->handle_csv_batch();
        if ($action === 'csv_cancel') $this->handle_csv_cancel();
        if ($action === 'csv_download_report') $this->handle_csv_download_report();
    }

    /**
     * Handle bulk import of ready publications as pending posts.
     * Uses cached raw HAL docs to avoid a second outbound API call.
     */
    private function handle_import() {
        $date_from = sanitize_text_field($_POST['hal_date_from'] ?? '');
        $date_to = sanitize_text_field($_POST['hal_date_to'] ?? '');
        $author_hal_id = sanitize_text_field($_POST['hal_author_id'] ?? '');

        // Reuse the cached preview data: raw_docs are stored alongside processed docs
        $preview = $this->get_preview_data($date_from, $date_to, $author_hal_id);
        if (is_wp_error($preview)) {
            $this->message = ['error', 'API Error: ' . $preview->get_error_message()];
            return;
        }

        $raw_docs = $preview['raw_docs'] ?? [];
        if (empty($raw_docs)) {
            $this->message = ['warning', 'Aucune publication dans le cache. Utilisez Filtrer pour charger les données d\'abord.'];
            return;
        }

        $this->load_wp_users_hal_ids();
        $importer = new Thalim_HAL_Importer_Logic();
        $imported = 0;
        $skipped = 0;
        $errors = [];

        foreach ($raw_docs as $doc) {
            $hal_id = $doc['halId_s'] ?? '';
            $author_hal_ids = (array) ($doc['authIdHal_s'] ?? []);
            $matched_users = $this->match_authors_to_users($author_hal_ids);

            if (empty($matched_users) || $importer->is_imported($hal_id)) {
                $skipped++;
                continue;
            }

            $post_id = $importer->import($doc, $this->wp_users_by_hal_id);
            if (is_wp_error($post_id)) {
                $errors[] = $hal_id . ': ' . $post_id->get_error_message();
            } else {
                $imported++;
            }
        }

        // The cached preview still lists the new posts as "ready"; drop it so the
        // preview rendered after this action reflects the import.
        if ($imported > 0) {
            delete_transient('thalim_hal_preview_' . md5($date_from . '|' . $date_to . '|' . $author_hal_id));
        }

        $msg = sprintf('%d publication(s) importée(s) en statut "En attente".', $imported);
        if ($skipped) $msg .= sprintf(' %d ignorée(s) (déjà importées ou sans membre THALIM correspondant).', $skipped);
        if (!empty($errors)) $msg .= ' Erreurs : ' . implode('; ', $errors);

        $this->message = [empty($errors) ? 'success' : 'warning', $msg];
    }

    private function render_styles() {
        ?>
        <style>
            .hal-status-imported { background-color: #d4edda !important; }
            .hal-status-ready { background-color: #fff3cd !important; }
            .hal-status-blocked { background-color: #f8d7da !important; }
            .hal-summary { display: flex; gap: 20px; padding: 15px; background: #f8f9fa; border-radius: 4px; margin-bottom: 20px; }
            .hal-summary-item { padding: 10px 15px; border-radius: 4px; text-align: center; }
            .hal-summary-item.total { background: #e3f2fd; border-left: 4px solid #2196f3; }
            .hal-summary-item.imported { background: #d4edda; border-left: 4px solid #28a745; }
            .hal-summary-item.ready { background: #fff3cd; border-left: 4px solid #ffc107; }
            .hal-summary-item.blocked { background: #f8d7da; border-left: 4px solid #dc3545; }
            .hal-summary-item strong { display: block; font-size: 24px; }
            .hal-type-badge { padding: 2px 8px; border-radius: 3px; font-size: 11px; font-weight: 500; }
            .hal-type-ART, .hal-type-ISSUE { background: #e3f2fd; color: #1565c0; }
            .hal-type-COUV, .hal-type-OUV { background: #f3e5f5; color: #7b1fa2; }
            .hal-type-PROCEEDINGS { background: #e8f5e9; color: #2e7d32; }
            .hal-type-THESE, .hal-type-HDR { background: #fff3e0; color: #e65100; }
            .hal-type-SON, .hal-type-VIDEO { background: #fce4ec; color: #c62828; }
            .hal-users { font-size: 11px; color: #666; }
            .hal-users-matched { color: #28a745; font-weight: 500; }
            .hal-users-none { color: #dc3545; font-style: italic; }
            .hal-preview-table th { white-space: nowrap; }
            .hal-preview-table td { vertical-align: top; }
        </style>
        <?php
    }

    private function render_message() {
        if (!$this->message) return;
        printf('<div class="notice notice-%s is-dismissible"><p>%s</p></div>',
            esc_attr($this->message[0]), esc_html($this->message[1]));
    }

    private function render_config() {
        ?>
        <div class="card" style="max-width:800px;margin-bottom:20px">
            <h2>Configuration</h2>
            <table class="form-table">
                <tr><th>Structure ID</th><td><code><?php echo esc_html((string) THALIM_HAL_STRUCT_ID); ?></code> (THALIM)</td></tr>
                <tr><th>Document Types</th><td><?php echo esc_html(implode(', ', THALIM_HAL_DOC_TYPES)); ?></td></tr>
                <tr><th>API Endpoint</th><td><code style="font-size:11px;word-break:break-all;"><?php echo esc_html($this->api->get_api_url(10)); ?></code></td></tr>
            </table>
            <form method="post" style="margin-top:15px">
                <?php wp_nonce_field('thalim_hal_action'); ?>
                <input type="hidden" name="thalim_hal_action" value="test_api">
                <button class="button button-primary">Test API Connection</button>
            </form>
        </div>
        <?php
    }
|
||||
|
||||
private function render_preview() {
|
||||
// Read filters from POST (after submit) or GET (page reload with state)
|
||||
$date_from = sanitize_text_field($_POST['hal_date_from'] ?? $_GET['hal_date_from'] ?? '');
|
||||
$date_to = sanitize_text_field($_POST['hal_date_to'] ?? $_GET['hal_date_to'] ?? '');
|
||||
$author_hal_id = sanitize_text_field($_POST['hal_author_id'] ?? $_GET['hal_author_id'] ?? '');
|
||||
|
||||
// Users must be loaded before rendering the dropdown
|
||||
$this->load_wp_users_hal_ids();
|
||||
|
||||
$preview = $this->get_preview_data($date_from, $date_to, $author_hal_id);
|
||||
$ready_count = is_wp_error($preview) ? 0 : $preview['stats']['ready'];
|
||||
?>
|
||||
<div class="card" style="max-width:100%;margin-bottom:20px">
|
||||
<h2>Import Preview</h2>
|
||||
|
||||
<form method="post" style="margin-bottom:20px;display:flex;align-items:center;gap:15px;flex-wrap:wrap">
|
||||
<?php wp_nonce_field('thalim_hal_action'); ?>
|
||||
|
||||
<label style="font-weight:600">Depuis
|
||||
<input type="date" name="hal_date_from" value="<?php echo esc_attr($date_from); ?>" style="width:auto">
|
||||
</label>
|
||||
|
||||
<label style="font-weight:600">Jusqu'au
|
||||
<input type="date" name="hal_date_to" value="<?php echo esc_attr($date_to); ?>" style="width:auto">
|
||||
</label>
|
||||
|
||||
<label style="font-weight:600">Auteur
|
||||
<select name="hal_author_id" style="max-width:220px">
|
||||
<option value="">— Tous —</option>
|
||||
<?php foreach ($this->wp_users_by_hal_id as $user): ?>
|
||||
<option value="<?php echo esc_attr($user['hal_id']); ?>"
|
||||
<?php selected($author_hal_id, $user['hal_id']); ?>>
|
||||
<?php echo esc_html($user['name']); ?>
|
||||
</option>
|
||||
<?php endforeach; ?>
|
||||
</select>
|
||||
</label>
|
||||
|
||||
<button class="button button-secondary" name="thalim_hal_action" value="filter">Filtrer</button>
|
||||
<button class="button button-secondary" name="thalim_hal_action" value="refresh" style="margin-left:5px">Rafraîchir</button>
|
||||
<small style="color:#666">Mis en cache 5 min</small>
|
||||
|
||||
<span style="flex:1"></span>
|
||||
|
||||
<button class="button button-primary" name="thalim_hal_action" value="import_pending"
|
||||
<?php if ($ready_count === 0): ?>disabled title="Aucune publication prête à importer"<?php endif; ?>>
|
||||
Importer <?php echo $ready_count; ?> publication(s) (En attente)
|
||||
</button>
|
||||
</form>
|
||||
|
||||
<?php if (is_wp_error($preview)): ?>
|
||||
<div class="notice notice-error"><p><?php echo esc_html($preview->get_error_message()); ?></p></div>
|
||||
<?php else: ?>
|
||||
<?php $this->render_wp_users_debug(); ?>
|
||||
<?php $this->render_summary($preview['stats']); ?>
|
||||
<?php $this->render_preview_table($preview['docs']); ?>
|
||||
<?php $this->render_legend(); ?>
|
||||
<?php endif; ?>
|
||||
</div>
|
||||
<?php
|
||||
}
|
||||
|
||||
private function render_summary($stats) {
|
||||
?>
|
||||
<div class="hal-summary">
|
||||
<div class="hal-summary-item total">
|
||||
<strong><?php echo number_format($stats['total']); ?></strong>
|
||||
<span>Total in HAL</span>
|
||||
</div>
|
||||
<div class="hal-summary-item imported">
|
||||
<strong><?php echo $stats['imported']; ?></strong>
|
||||
<span>Already Imported</span>
|
||||
</div>
|
||||
<div class="hal-summary-item ready">
|
||||
<strong><?php echo $stats['ready']; ?></strong>
|
||||
<span>Ready to Import</span>
|
||||
</div>
|
||||
<div class="hal-summary-item blocked">
|
||||
<strong><?php echo $stats['blocked']; ?></strong>
|
||||
<span>No Matched User</span>
|
||||
</div>
|
||||
</div>
|
||||
<?php
|
||||
}
|
||||
|
||||
private function render_preview_table($docs) {
|
||||
if (empty($docs)) {
|
||||
echo '<p>No publications found.</p>';
|
||||
return;
|
||||
}
|
||||
?>
|
||||
<table class="widefat hal-preview-table">
|
||||
<thead>
|
||||
<tr>
|
||||
<th>Statut</th>
|
||||
<th>HAL ID</th>
|
||||
<th>Titre</th>
|
||||
<th>Type</th>
|
||||
<th>Auteurs</th>
|
||||
<th>IDs HAL auteurs</th>
|
||||
<th>Date</th>
|
||||
<th>Membres THALIM</th>
|
||||
<th>Lien HAL</th>
|
||||
</tr>
|
||||
</thead>
|
||||
<tbody>
|
||||
<?php foreach ($docs as $doc): ?>
|
||||
<tr class="<?php echo esc_attr($this->get_row_class($doc)); ?>">
|
||||
<td><?php echo $this->get_status_icon($doc); ?></td>
|
||||
<td><code style="font-size:11px"><?php echo esc_html($doc['hal_id']); ?></code></td>
|
||||
<td>
|
||||
<strong><?php echo esc_html($doc['title']); ?></strong>
|
||||
<?php if ($doc['journal']): ?>
|
||||
<br><small style="color:#666"><?php echo esc_html($doc['journal']); ?></small>
|
||||
<?php endif; ?>
|
||||
</td>
|
||||
<td>
|
||||
<span class="hal-type-badge hal-type-<?php echo esc_attr($doc['type']); ?>">
|
||||
<?php echo esc_html(self::DOC_TYPE_LABELS[$doc['type']] ?? $doc['type']); ?>
|
||||
</span>
|
||||
</td>
|
||||
<td class="hal-users">
|
||||
<?php echo esc_html(implode(', ', $doc['authors'])); ?>
|
||||
</td>
|
||||
<td class="hal-users">
|
||||
<?php if (!empty($doc['author_hal_ids'])): ?>
|
||||
<?php foreach ($doc['author_hal_ids'] as $aid):
|
||||
$normalized = strtolower(trim($aid));
|
||||
$is_match = isset($this->wp_users_by_hal_id[$normalized]);
|
||||
?>
|
||||
<code style="font-size:10px;<?php echo $is_match ? 'background:#d4edda;' : ''; ?>"><?php echo esc_html($aid); ?></code>
|
||||
<?php endforeach; ?>
|
||||
<?php else: ?>
|
||||
<span style="color:#999;font-style:italic">aucun</span>
|
||||
<?php endif; ?>
|
||||
</td>
|
||||
<td><?php echo esc_html($doc['publication_date'] ?: ($doc['produced_date'] ?: '-')); ?></td>
|
||||
<td>
|
||||
<?php if (!empty($doc['matched_users'])): ?>
|
||||
<span class="hal-users-matched">
|
||||
<?php echo esc_html(implode(', ', $doc['matched_users'])); ?>
|
||||
</span>
|
||||
<?php else: ?>
|
||||
<span class="hal-users-none">Aucun</span>
|
||||
<?php endif; ?>
|
||||
</td>
|
||||
<td>
|
||||
<a href="<?php echo esc_url($doc['url']); ?>" target="_blank" class="button button-small">Voir sur HAL</a>
|
||||
</td>
|
||||
</tr>
|
||||
<?php endforeach; ?>
|
||||
</tbody>
|
||||
</table>
|
||||
<?php
|
||||
}
|
||||
|
||||
private function render_legend() {
|
||||
?>
|
||||
<div style="margin-top:15px;padding:10px;background:#f8f9fa;border-radius:4px;font-size:12px">
|
||||
<strong>Légende :</strong>
|
||||
<span style="margin-left:15px;"><span style="background:#d4edda;padding:2px 8px;border-radius:3px;">✓ Importé</span></span>
|
||||
<span style="margin-left:15px;"><span style="background:#fff3cd;padding:2px 8px;border-radius:3px;">★ Prêt</span> Membre THALIM identifié</span>
|
||||
<span style="margin-left:15px;"><span style="background:#f8d7da;padding:2px 8px;border-radius:3px;">✗ Bloqué</span> Aucun membre THALIM ne correspond aux IDs auteurs HAL</span>
|
||||
</div>
|
||||
<?php
|
||||
}
|
||||
|
||||
private function render_wp_users_debug() {
|
||||
$this->load_wp_users_hal_ids();
|
||||
if (empty($this->wp_users_by_hal_id)) {
|
||||
echo '<div class="notice notice-warning"><p>Aucun utilisateur WordPress n\'a le champ <code>identifiant_hal</code> renseigné.</p></div>';
|
||||
return;
|
||||
}
|
||||
?>
|
||||
<details style="margin-bottom:15px;background:#f0f6fc;padding:10px;border-radius:4px;border-left:4px solid #0073aa">
|
||||
<summary style="cursor:pointer;font-weight:bold">
|
||||
Utilisateurs WordPress avec identifiant HAL (<?php echo count($this->wp_users_by_hal_id); ?> utilisateurs) — Cliquer pour déplier
|
||||
</summary>
|
||||
<table class="widefat" style="margin-top:10px;font-size:12px">
|
||||
<thead><tr><th>Utilisateur</th><th>Identifiant HAL</th><th>Debug (brut)</th><th>Modifier</th></tr></thead>
|
||||
<tbody>
|
||||
<?php foreach ($this->wp_users_by_hal_id as $hal_id => $user): ?>
|
||||
<tr>
|
||||
<td><?php echo esc_html($user['name']); ?> <small style="color:#999">(ID : <?php echo esc_html($user['id']); ?>)</small></td>
|
||||
<td><code><?php echo esc_html($hal_id); ?></code></td>
|
||||
<td><code style="background:#ffe;font-size:10px">"<?php echo esc_html($hal_id); ?>" (<?php echo strlen($hal_id); ?> car.)</code></td>
|
||||
<td><a href="<?php echo esc_url(get_edit_user_link($user['id'])); ?>" target="_blank" class="button button-small">Modifier</a></td>
|
||||
</tr>
|
||||
<?php endforeach; ?>
|
||||
</tbody>
|
||||
</table>
|
||||
</details>
|
||||
<?php
|
||||
}

    private function get_preview_data($date_from = '', $date_to = '', $author_hal_id = '') {
        $cache_key = 'thalim_hal_preview_' . md5($date_from . '|' . $date_to . '|' . $author_hal_id);
        $cached = get_transient($cache_key);
        if ($cached !== false) return $cached;

        // Fetch more rows when at least one filter narrows the result set
        $rows = ($date_from || $date_to || $author_hal_id) ? 200 : 50;
        $result = $this->api->fetch_publications($rows, 0, 'producedDate_tdate desc', $date_from, $date_to, $author_hal_id);
        if (is_wp_error($result)) return $result;

        $importer = new Thalim_HAL_Importer_Logic();
        $this->load_wp_users_hal_ids();

        $preview = [
            'stats' => [
                'total' => $result['response']['numFound'] ?? 0,
                'imported' => 0,
                'ready' => 0,
                'blocked' => 0
            ],
            'docs' => [],
            'raw_docs' => [], // Raw HAL docs kept for import, avoids a second API call
        ];

        foreach ($result['response']['docs'] ?? [] as $doc) {
            $hal_id = $doc['halId_s'] ?? '';
            $is_imported = $importer->is_imported($hal_id);
            $author_hal_ids = (array) ($doc['authIdHal_s'] ?? []);
            $matched_users = $this->match_authors_to_users($author_hal_ids);
            $has_match = !empty($matched_users);

            // Update stats
            if ($is_imported) {
                $preview['stats']['imported']++;
            } elseif ($has_match) {
                $preview['stats']['ready']++;
            } else {
                $preview['stats']['blocked']++;
            }

            $preview['docs'][] = [
                'hal_id' => $hal_id,
                'title' => $doc['title_s'][0] ?? 'N/A',
                'type' => $doc['docType_s'] ?? '',
                'authors' => $doc['authFullName_s'] ?? [],
                'author_hal_ids' => $author_hal_ids,
                'publication_date' => $doc['publicationDate_s'] ?? '',
                // Prefer the produced date (the field the filters and sort use);
                // fall back to the submission date when the response lacks it.
                'produced_date' => $doc['producedDate_s'] ?? $doc['submittedDate_s'] ?? '',
                'journal' => $doc['journalTitle_s'] ?? $doc['bookTitle_s'] ?? '',
                'url' => $doc['uri_s'] ?? '',
                'is_imported' => $is_imported,
                'matched_users' => $matched_users,
                'has_match' => $has_match,
            ];
            $preview['raw_docs'][] = $doc; // Full HAL doc kept for import
        }

        set_transient($cache_key, $preview, 300); // 5 minutes
        return $preview;
    }

    /**
     * Load all WordPress users with HAL IDs into cache.
     * Stores: normalized_hal_id => ['id' => int, 'name' => string, 'hal_id' => string]
     */
    private function load_wp_users_hal_ids() {
        if ($this->wp_users_by_hal_id !== null) return;

        $this->wp_users_by_hal_id = [];
        $users = get_users([
            'meta_key' => 'identifiant_hal',
            'meta_compare' => 'EXISTS'
        ]);

        foreach ($users as $user) {
            $hal_id = get_user_meta($user->ID, 'identifiant_hal', true);
            if (!empty($hal_id)) {
                $normalized = strtolower(trim($hal_id));
                $this->wp_users_by_hal_id[$normalized] = [
                    'id' => $user->ID,
                    'name' => $user->display_name,
                    'hal_id' => trim($hal_id), // original value for API filter
                ];
            }
        }
    }

    /**
     * Match HAL author IDs to WordPress users.
     * Returns array of display names (for preview display).
     * Expects load_wp_users_hal_ids() to have been called first.
     */
    private function match_authors_to_users($author_hal_ids) {
        $matched = [];
        foreach ((array) $author_hal_ids as $hal_id) {
            $normalized = strtolower(trim($hal_id));
            if (isset($this->wp_users_by_hal_id[$normalized])) {
                $matched[] = $this->wp_users_by_hal_id[$normalized]['name'];
            }
        }
        return $matched;
    }
|
||||
|
||||
// ========================================================================
|
||||
// CSV bulk import (phase 2 — legacy publications from SPIP)
|
||||
// ========================================================================
|
||||
|
||||
private const CSV_QUEUE_OPTION = 'thalim_hal_csv_queue';
|
||||
private const CSV_BATCH_SIZE = 100;
|
||||
|
||||
private function render_csv_import() {
|
||||
$queue = get_option(self::CSV_QUEUE_OPTION, null);
|
||||
?>
|
||||
<div class="card" style="max-width:100%;margin-bottom:20px">
|
||||
<h2>Import en masse depuis CSV</h2>
|
||||
<p>
|
||||
Uploader le couple <code>hal-to-import.csv</code> + <code>hal-to-import-context.json</code>
|
||||
(généré par <code>php scripts/prepare-csv-context.php</code>) pour importer les publications legacy.
|
||||
Chaque batch traite <?php echo self::CSV_BATCH_SIZE; ?> publications — cliquer plusieurs fois jusqu'à terminaison.
|
||||
</p>
|
||||
|
||||
<?php if (!$queue): ?>
|
||||
<form method="post" enctype="multipart/form-data">
|
||||
<?php wp_nonce_field('thalim_hal_action'); ?>
|
||||
<input type="hidden" name="thalim_hal_action" value="csv_upload">
|
||||
<table class="form-table">
|
||||
<tr>
|
||||
<th>Fichier CSV</th>
|
||||
<td><input type="file" name="csv_file" accept=".csv" required></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Fichier contexte (JSON)</th>
|
||||
<td><input type="file" name="ctx_file" accept=".json" required>
|
||||
<p class="description">Généré par <code>scripts/prepare-csv-context.php</code>.</p></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Statut des posts</th>
|
||||
<td>
|
||||
<label><input type="radio" name="post_status" value="publish" checked> Publié directement</label>
|
||||
<label style="margin-left:15px"><input type="radio" name="post_status" value="pending"> En attente (pour relecture)</label>
|
||||
</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Date du post WP</th>
|
||||
<td>
|
||||
<label><input type="checkbox" name="backdate_post" value="1" checked>
|
||||
Utiliser la date HAL (<code>producedDate_s</code>) comme <code>post_date</code></label>
|
||||
</td>
|
||||
</tr>
|
||||
</table>
|
||||
<p><button class="button button-primary">Charger le CSV et préparer l'import</button></p>
|
||||
</form>
|
||||
<?php else: ?>
|
||||
<?php $this->render_csv_progress($queue); ?>
|
||||
<?php endif; ?>
|
||||
</div>
|
||||
<?php
|
||||
}
|
||||
|
||||
private function render_csv_progress(array $queue): void {
|
||||
$total = (int) ($queue['total'] ?? 0);
|
||||
$done = (int) ($queue['done'] ?? 0);
|
||||
$remaining = max(0, $total - $done);
|
||||
$pct = $total > 0 ? round(100 * $done / $total, 1) : 0;
|
||||
$report_ct = count($queue['report'] ?? []);
|
||||
?>
|
||||
<div style="padding:15px;background:#f0f6fc;border-left:4px solid #0073aa">
|
||||
<p><strong>File d'attente active</strong> — statut cible : <code><?php echo esc_html($queue['status']); ?></code>
|
||||
— backdate : <?php echo !empty($queue['backdate']) ? 'oui' : 'non'; ?></p>
|
||||
<p>
|
||||
<strong><?php echo number_format($done); ?></strong> / <?php echo number_format($total); ?>
|
||||
publications traitées (<?php echo esc_html($pct); ?>%)
|
||||
— <?php echo number_format($remaining); ?> restantes
|
||||
</p>
|
||||
<?php if (!empty($queue['updated_at'])): ?>
|
||||
<p><small>Dernière mise à jour : <?php echo esc_html($queue['updated_at']); ?></small></p>
|
||||
<?php endif; ?>
|
||||
|
||||
<?php if (!empty($queue['last_error'])): ?>
|
||||
<div class="notice notice-error" style="margin:10px 0"><p>
|
||||
Erreur dernier batch : <?php echo esc_html($queue['last_error']); ?>
|
||||
</p></div>
|
||||
<?php endif; ?>
|
||||
|
||||
<form method="post" style="display:inline-block;margin-right:10px">
|
||||
<?php wp_nonce_field('thalim_hal_action'); ?>
|
||||
<input type="hidden" name="thalim_hal_action" value="csv_batch">
|
||||
<button class="button button-primary" <?php if ($remaining === 0) echo 'disabled'; ?>>
|
||||
Traiter le prochain batch (<?php echo min(self::CSV_BATCH_SIZE, $remaining); ?>)
|
||||
</button>
|
||||
</form>
|
||||
|
||||
<?php if ($report_ct > 0): ?>
|
||||
<form method="post" style="display:inline-block;margin-right:10px">
|
||||
<?php wp_nonce_field('thalim_hal_action'); ?>
|
||||
<input type="hidden" name="thalim_hal_action" value="csv_download_report">
|
||||
<button class="button">Télécharger le rapport CSV (<?php echo $report_ct; ?> lignes)</button>
|
||||
</form>
|
||||
<?php endif; ?>
|
||||
|
||||
<form method="post" style="display:inline-block" onsubmit="return confirm('Vraiment annuler l\'import en cours ?');">
|
||||
<?php wp_nonce_field('thalim_hal_action'); ?>
|
||||
<input type="hidden" name="thalim_hal_action" value="csv_cancel">
|
||||
<button class="button button-link-delete">Annuler et réinitialiser</button>
|
||||
</form>
|
||||
</div>
|
||||
<?php
|
||||
}

    private function handle_csv_upload(): void {
        if (empty($_FILES['csv_file']['tmp_name']) || empty($_FILES['ctx_file']['tmp_name'])) {
            $this->message = ['error', 'CSV ou fichier contexte manquant.'];
            return;
        }

        // Parse CSV -> list of hal_ids
        $fh = fopen($_FILES['csv_file']['tmp_name'], 'r');
        if (!$fh) { $this->message = ['error', 'Impossible de lire le CSV.']; return; }
        $header = fgetcsv($fh);
        // fgetcsv() returns false on an empty file; array_search() would then throw a TypeError on PHP 8
        if (!is_array($header)) {
            fclose($fh);
            $this->message = ['error', 'CSV vide ou illisible.'];
            return;
        }
        // Strip a UTF-8 BOM and stray whitespace so that 'hal_id' still matches in the first column
        $header = array_map(function ($h) { return trim((string) $h, " \t\n\r\0\x0B\xEF\xBB\xBF"); }, $header);
        $hal_col = array_search('hal_id', $header);
        $spip_col = array_search('spip_id', $header);
        if ($hal_col === false) {
            fclose($fh);
            $this->message = ['error', 'Header CSV : colonne hal_id manquante.'];
            return;
        }
        $hal_ids = [];
        $spip_map = []; // hal_id => spip_id
        while (($row = fgetcsv($fh)) !== false) {
            $hid = trim($row[$hal_col] ?? '');
            if ($hid === '') continue;
            $hal_ids[] = $hid;
            if ($spip_col !== false) $spip_map[$hid] = trim($row[$spip_col] ?? '');
        }
        fclose($fh);
        $hal_ids = array_values(array_unique($hal_ids));

        // Parse JSON context
        $ctx_raw = file_get_contents($_FILES['ctx_file']['tmp_name']);
        $ctx_data = json_decode((string) $ctx_raw, true);
        if (!is_array($ctx_data) || !isset($ctx_data['ctx'])) {
            $this->message = ['error', 'Fichier contexte JSON invalide.'];
            return;
        }

        // Anything other than "pending" falls back to "publish"
        $status = ($_POST['post_status'] ?? 'publish') === 'pending' ? 'pending' : 'publish';
        $backdate = !empty($_POST['backdate_post']);

        $queue = [
            'hal_ids' => $hal_ids,
            'spip_map' => $spip_map,
            'status' => $status,
            'backdate' => $backdate,
            'total' => count($hal_ids),
            'done' => 0,
            'spip_ctx' => $ctx_data['ctx'],
            'wp_users_by_hal_id' => $ctx_data['wp_users_by_hal_id'] ?? [],
            'report' => [],
            'last_error' => '',
            'updated_at' => current_time('mysql'),
        ];
        update_option(self::CSV_QUEUE_OPTION, $queue, false); // autoload disabled: the queue can be large
        $this->message = ['success', sprintf(
            'CSV chargé : %d publications prêtes. Statut cible : %s. Cliquer "Traiter le prochain batch" pour lancer.',
            count($hal_ids), $status
        )];
    }
|
||||
|
||||
private function handle_csv_batch(): void {
|
||||
$queue = get_option(self::CSV_QUEUE_OPTION, null);
|
||||
if (!$queue) { $this->message = ['error', 'Aucune queue active.']; return; }
|
||||
|
||||
$batch = array_slice($queue['hal_ids'], $queue['done'], self::CSV_BATCH_SIZE);
|
||||
if (empty($batch)) {
|
||||
$this->message = ['success', 'Import terminé — tous les batches ont été traités.'];
|
||||
return;
|
||||
}
|
||||
|
||||
$docs = $this->api->fetch_by_hal_ids($batch, self::CSV_BATCH_SIZE);
|
||||
if (is_wp_error($docs)) {
|
||||
$queue['last_error'] = $docs->get_error_message();
|
||||
$queue['updated_at'] = current_time('mysql');
|
||||
update_option(self::CSV_QUEUE_OPTION, $queue, false);
|
||||
$this->message = ['error', 'Erreur HAL API : ' . $docs->get_error_message()];
|
||||
return;
|
||||
}
|
||||
|
||||
// Normalize wp_users_by_hal_id keys to lowercase for the importer
|
||||
$users_map = [];
|
||||
foreach ($queue['wp_users_by_hal_id'] as $hid => $u) {
|
||||
$users_map[strtolower(trim((string) $hid))] = $u;
|
||||
}
|
||||
|
||||
$importer = new Thalim_HAL_Importer_Logic();
|
||||
$batch_imported = 0;
|
||||
$batch_skipped = 0;
|
||||
$batch_errors = 0;
|
||||
|
||||
foreach ($batch as $hal_id) {
|
||||
$spip_id = $queue['spip_map'][$hal_id] ?? '';
|
||||
$doc = $docs[$hal_id] ?? null;
|
||||
$ctx = $queue['spip_ctx'][$hal_id] ?? [];
|
||||
|
||||
if (!$doc) {
|
||||
$queue['report'][] = [$hal_id, $spip_id, '', 'not_found_in_hal', 'false', 'none', 'HAL API did not return this hal_id'];
|
||||
$batch_errors++;
|
||||
continue;
|
||||
}
|
||||
|
||||
$post_id = $importer->import($doc, $users_map, $queue['status'], (bool) $queue['backdate'], $ctx);
|
||||
if (is_wp_error($post_id)) {
|
||||
$code = $post_id->get_error_code();
|
||||
$queue['report'][] = [$hal_id, $spip_id, '', $code, 'false', 'none', $post_id->get_error_message()];
|
||||
if ($code === 'exists') $batch_skipped++;
|
||||
else $batch_errors++;
|
||||
} else {
|
||||
$source = $importer->last_axes_source;
|
||||
$has_axe = $source !== 'none' ? 'true' : 'false';
|
||||
$queue['report'][] = [$hal_id, $spip_id, (string) $post_id, 'imported', $has_axe, $source, ''];
|
||||
$batch_imported++;
|
||||
}
|
||||
}
|
||||
|
||||
$queue['done'] += count($batch);
|
||||
$queue['last_error'] = '';
|
||||
$queue['updated_at'] = current_time('mysql');
|
||||
update_option(self::CSV_QUEUE_OPTION, $queue, false);
|
||||
|
||||
$this->message = ['success', sprintf(
|
||||
'Batch traité : %d importé(s), %d déjà importé(s), %d erreur(s). Progression : %d / %d.',
|
||||
$batch_imported, $batch_skipped, $batch_errors,
|
||||
$queue['done'], $queue['total']
|
||||
)];
|
||||
}
|
||||
|
||||
private function handle_csv_cancel(): void {
|
||||
delete_option(self::CSV_QUEUE_OPTION);
|
||||
$this->message = ['success', 'Queue CSV annulée.'];
|
||||
}
|
||||
|
||||
private function handle_csv_download_report(): void {
|
||||
$queue = get_option(self::CSV_QUEUE_OPTION, null);
|
||||
if (!$queue || empty($queue['report'])) {
|
||||
$this->message = ['warning', 'Aucun rapport à télécharger.'];
|
||||
return;
|
||||
}
|
||||
$filename = 'hal-import-report-' . wp_date('Ymd-His') . '.csv'; // wp_date() uses the site timezone
|
||||
header('Content-Type: text/csv; charset=utf-8');
|
||||
header('Content-Disposition: attachment; filename="' . $filename . '"');
|
||||
$out = fopen('php://output', 'w');
|
||||
fputcsv($out, ['hal_id', 'spip_id', 'post_id', 'status', 'has_axe', 'axes_source', 'error']);
|
||||
foreach ($queue['report'] as $row) fputcsv($out, $row);
|
||||
fclose($out);
|
||||
exit;
|
||||
}
|
||||
|
||||
// ========================================================================
|
||||
// End CSV bulk import
|
||||
// ========================================================================
|
||||
|
||||
private function get_row_class($doc) {
|
||||
if ($doc['is_imported']) return 'hal-status-imported';
|
||||
if ($doc['has_match']) return 'hal-status-ready';
|
||||
return 'hal-status-blocked';
|
||||
}
|
||||
|
||||
private function get_status_icon($doc) {
|
||||
if ($doc['is_imported']) return '<span title="Already imported" style="color:#28a745;font-size:18px">✓</span>';
|
||||
if ($doc['has_match']) return '<span title="Ready to import" style="color:#ffc107;font-size:18px">★</span>';
|
||||
return '<span title="No matched user" style="color:#dc3545;font-size:18px">✗</span>';
|
||||
}
|
||||
}
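The CSV parsing step in handle_csv_upload() can be sketched on its own as follows. This is a minimal illustration, not the plugin's code: the function name `example_parse_hal_csv` is hypothetical, and the BOM stripping and header trimming are additions that assume the CSV may come from a spreadsheet export.

```php
<?php
/**
 * Hypothetical helper (not part of the plugin): read a CSV stream and return
 * the unique hal_id values plus a hal_id => spip_id map.
 *
 * @param resource $fh Open, readable stream positioned at the header row.
 * @return array|null ['hal_ids' => string[], 'spip_map' => array] or null
 *                    when the file is empty or has no hal_id column.
 */
function example_parse_hal_csv($fh): ?array {
    $header = fgetcsv($fh, 0, ',', '"', '\\');
    if (!is_array($header)) {
        return null; // empty file
    }
    // Strip a UTF-8 BOM that spreadsheet exports often prepend to the first cell.
    $header[0] = preg_replace('/^\xEF\xBB\xBF/', '', (string) $header[0]);
    $header = array_map(fn($h) => trim((string) $h), $header);

    $hal_col  = array_search('hal_id', $header, true);
    $spip_col = array_search('spip_id', $header, true);
    if ($hal_col === false) {
        return null;
    }

    $hal_ids  = [];
    $spip_map = [];
    while (($row = fgetcsv($fh, 0, ',', '"', '\\')) !== false) {
        // A blank line yields [null]; it is skipped by the empty check below.
        $hid = trim((string) ($row[$hal_col] ?? ''));
        if ($hid === '') {
            continue;
        }
        $hal_ids[] = $hid;
        if ($spip_col !== false) {
            $spip_map[$hid] = trim((string) ($row[$spip_col] ?? ''));
        }
    }

    return [
        'hal_ids'  => array_values(array_unique($hal_ids)),
        'spip_map' => $spip_map,
    ];
}
```

Taking a stream rather than a path keeps the helper testable with `php://memory`; in the upload handler the stream would come from `fopen($_FILES['csv_file']['tmp_name'], 'r')`.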
|
||||
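The queue in handle_csv_batch() advances a `done` cursor over the `hal_ids` list, one slice per request. A minimal sketch of that cursor logic, assuming a queue array with only the `hal_ids` and `done` keys; `example_take_batch` is a hypothetical name and the batch size of 2 is arbitrary (the real size is `self::CSV_BATCH_SIZE`, whose value is not shown in this section).

```php
<?php
/**
 * Hypothetical helper: return the next slice of IDs together with the queue
 * whose cursor has been advanced past that slice.
 */
function example_take_batch(array $queue, int $size): array {
    $batch = array_slice($queue['hal_ids'], $queue['done'], $size);
    $queue['done'] += count($batch);
    return [$batch, $queue];
}

// Drain a five-item queue two items at a time.
$queue = ['hal_ids' => ['a', 'b', 'c', 'd', 'e'], 'done' => 0];
$seen  = [];
while (true) {
    [$batch, $queue] = example_take_batch($queue, 2);
    if ($batch === []) {
        break; // an empty slice means every ID has been processed
    }
    $seen[] = $batch;
}
```

Because the cursor only moves after a batch has been handled, a failed HAL request (the early return in handle_csv_batch()) leaves `done` untouched and the same slice is retried on the next click.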
137
includes/class-hal-api.php
Normal file
@@ -0,0 +1,137 @@
|
||||
<?php
|
||||
/**
|
||||
* HAL API Class - Handles communication with HAL API
|
||||
*/
|
||||
|
||||
if (!defined('ABSPATH')) {
|
||||
exit;
|
||||
}
|
||||
|
||||
class Thalim_HAL_API {
|
||||
|
||||
// Fields to retrieve from HAL (expanded for import preview)
|
||||
private const FIELDS = 'halId_s,title_s,docType_s,authFullName_s,authIdHal_s,publicationDate_s,producedDate_s,submittedDate_s,journalTitle_s,bookTitle_s,uri_s,fileMain_s,abstract_s,keyword_s,publisher_s,modifiedDate_s,doiId_s,citationFull_s,conferenceTitle_s,city_s,country_s,defenseDate_s';
|
||||
|
||||
/**
|
||||
* Fetch publications from THALIM structure
|
||||
*
|
||||
* @param int $rows Number of results to fetch
|
||||
* @param int $start Offset
|
||||
* @param string $sort Solr sort expression
|
||||
* @param string $date_from Filter start date (YYYY-MM-DD), empty = no lower bound
|
||||
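* @param string $date_to Filter end date (YYYY-MM-DD), empty = no upper bound
* @param string $author_hal_id idHAL to filter on (authIdHal_s), empty = all authors
* @return array|WP_Error Decoded Solr response, or WP_Error on failure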
|
||||
*/
|
||||
public function fetch_publications($rows = 100, $start = 0, $sort = 'producedDate_tdate desc', $date_from = '', $date_to = '', $author_hal_id = '') {
|
||||
$url = $this->build_url($rows, $start, $sort, $date_from, $date_to, $author_hal_id);
|
||||
return $this->request($url);
|
||||
}
|
||||
|
||||
/**
|
||||
* Fetch full HAL docs by a list of hal_ids (batched).
|
||||
* Uses Solr fq=halId_s:(id1 OR id2 OR ...) syntax. No structId filter:
* documents are fetched by exact halId, regardless of structure.
|
||||
*
|
||||
* @param string[] $hal_ids HAL IDs to fetch.
|
||||
* @param int $batch Batch size (default 100).
|
||||
* @return array|WP_Error Array keyed by halId_s, or WP_Error on failure.
|
||||
*/
|
||||
public function fetch_by_hal_ids(array $hal_ids, int $batch = 100) {
|
||||
$docs = [];
|
||||
$chunks = array_chunk(array_values(array_unique($hal_ids)), $batch);
|
||||
foreach ($chunks as $chunk) {
|
||||
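// Quote each ID so that Solr operators in CSV-supplied values stay literal
$filter = 'halId_s:(' . implode(' OR ', array_map(fn($id) => '"' . addcslashes((string) $id, '"\\') . '"', $chunk)) . ')';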
|
||||
$params = [
|
||||
'q=' . urlencode('*:*'),
|
||||
'fq=' . urlencode($filter),
|
||||
'rows=' . count($chunk),
|
||||
'fl=' . urlencode(self::FIELDS),
|
||||
'wt=json',
|
||||
];
|
||||
$url = THALIM_HAL_API_BASE . '?' . implode('&', $params);
|
||||
$data = $this->request($url);
|
||||
if (is_wp_error($data)) return $data;
|
||||
foreach ($data['response']['docs'] ?? [] as $doc) {
|
||||
if (!empty($doc['halId_s'])) {
|
||||
$docs[$doc['halId_s']] = $doc;
|
||||
}
|
||||
}
|
||||
// Be polite with HAL if we have multiple chunks
|
||||
if (count($chunks) > 1) usleep(250000);
|
||||
}
|
||||
return $docs;
|
||||
}
|
||||
|
||||
/**
|
||||
* Test API connection
|
||||
*/
|
||||
public function test_connection() {
|
||||
$result = $this->fetch_publications(5);
|
||||
if (is_wp_error($result)) return $result;
|
||||
return [
|
||||
'success' => true,
|
||||
'total' => $result['response']['numFound'] ?? 0,
|
||||
'sample' => $result['response']['docs'] ?? []
|
||||
];
|
||||
}
|
||||
|
||||
/**
|
||||
* Build API URL with proper fq parameter handling
|
||||
*
|
||||
* @param int $rows
|
||||
* @param int $start
|
||||
* @param string $sort
|
||||
* @param string $date_from YYYY-MM-DD or empty
|
||||
* @param string $date_to YYYY-MM-DD or empty
|
||||
*/
|
||||
private function build_url($rows = 5, $start = 0, $sort = 'modifiedDate_tdate desc', $date_from = '', $date_to = '', $author_hal_id = '') {
|
||||
$doc_types = implode(' OR ', THALIM_HAL_DOC_TYPES);
|
||||
|
||||
$from = $date_from ? $date_from . 'T00:00:00Z' : '*';
|
||||
$to = $date_to ? $date_to . 'T23:59:59Z' : '*';
|
||||
|
||||
$params = [
|
||||
'q=' . urlencode('*:*'),
|
||||
'fq=' . urlencode('structId_i:' . THALIM_HAL_STRUCT_ID),
|
||||
'fq=' . urlencode('docType_s:(' . $doc_types . ')'),
|
||||
];
|
||||
|
||||
if ($from !== '*' || $to !== '*') {
|
||||
$params[] = 'fq=' . urlencode('producedDate_tdate:[' . $from . ' TO ' . $to . ']');
|
||||
}
|
||||
|
||||
if ($author_hal_id !== '') {
|
||||
// Quote the idHAL so that Solr operators in user input stay literal
$params[] = 'fq=' . urlencode('authIdHal_s:"' . addcslashes($author_hal_id, '"\\') . '"');
|
||||
}
|
||||
|
||||
$params = array_merge($params, [
|
||||
'rows=' . intval($rows),
|
||||
'start=' . intval($start),
|
||||
'sort=' . urlencode($sort),
|
||||
'fl=' . urlencode(self::FIELDS),
|
||||
'wt=json'
|
||||
]);
|
||||
|
||||
return THALIM_HAL_API_BASE . '?' . implode('&', $params);
|
||||
}
|
||||
|
||||
/**
|
||||
* Get API URL for debugging display
|
||||
*/
|
||||
public function get_api_url($rows = 5) {
|
||||
return $this->build_url($rows, 0);
|
||||
}
|
||||
|
||||
/**
|
||||
* Make HTTP request
|
||||
*/
|
||||
private function request($url) {
|
||||
$response = wp_remote_get($url, ['timeout' => 30, 'headers' => ['Accept' => 'application/json']]);
|
||||
if (is_wp_error($response)) return $response;
|
||||
|
||||
$code = wp_remote_retrieve_response_code($response);
|
||||
if ($code !== 200) return new WP_Error('api_error', "HTTP $code");
|
||||
|
||||
$data = json_decode(wp_remote_retrieve_body($response), true);
|
||||
return json_last_error() === JSON_ERROR_NONE ? $data : new WP_Error('json_error', 'Invalid JSON');
|
||||
}
|
||||
}
|
||||
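fetch_by_hal_ids() builds a Solr filter query from a list of IDs. The sketch below shows one way to build that filter with each value quoted, so that characters such as `*`, `:` or `)` in CSV input cannot alter the query. The function name `example_build_hal_id_filter` is hypothetical, and quoting is an assumption about what the HAL Solr endpoint accepts for the `halId_s` field, not something the source confirms.

```php
<?php
/**
 * Hypothetical helper: build a Solr filter on halId_s from a list of IDs.
 * Returns '' when no usable ID remains, so the caller can skip the request.
 */
function example_build_hal_id_filter(array $hal_ids): string {
    // Trim, drop empties, de-duplicate.
    $clean = array_map(fn($id) => trim((string) $id), $hal_ids);
    $clean = array_values(array_unique(array_filter($clean, fn($id) => $id !== '')));
    if ($clean === []) {
        return '';
    }
    // Inside a quoted Solr phrase only the double quote and the backslash
    // need escaping; everything else is taken literally.
    $quoted = array_map(fn($id) => '"' . addcslashes($id, '"\\') . '"', $clean);
    return 'halId_s:(' . implode(' OR ', $quoted) . ')';
}

// Usage: the filter is URL-encoded as a whole, as in the class above.
$fq = 'fq=' . urlencode(example_build_hal_id_filter(['hal-01', 'hal-02']));
```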
523
includes/class-importer.php
Normal file
@@ -0,0 +1,523 @@
|
||||
<?php
|
||||
/**
|
||||
* Importer Logic Class - Handles import from HAL to WordPress
|
||||
*/
|
||||
|
||||
if (!defined('ABSPATH')) {
|
||||
exit;
|
||||
}
|
||||
|
||||
class Thalim_HAL_Importer_Logic {
|
||||
|
||||
// HAL doc type -> WordPress category ID
|
||||
private const DOC_TYPE_MAP = [
|
||||
'ART' => 16, // Article
|
||||
'COUV' => 16, // Chapitre -> Articles
|
||||
'OUV' => 15, // Ouvrage -> Ouvrages
|
||||
'COMM' => 13, // Communication -> Communications
|
||||
'ISSUE' => 16, // Direction de numéro -> Articles
|
||||
'PROCEEDINGS' => 15, // Direction d'ouvrage/Proceedings -> Ouvrages
|
||||
'THESE' => 14, // Thèse -> Soutenances
|
||||
'HDR' => 14, // HDR -> Soutenances
|
||||
'SON' => 19, // Son -> Captations audio/vidéo
|
||||
'VIDEO' => 19, // Vidéo -> Captations audio/vidéo
|
||||
];
|
||||
|
||||
// Doc types that use date_de_debut instead of datetime
|
||||
private const EVENT_DOC_TYPES = ['COMM', 'THESE', 'HDR', 'SON', 'VIDEO'];
|
||||
|
||||
// Pods IDs — queried from the DB, stable per installation
|
||||
private const POD_ID_POST = 8;
|
||||
private const FIELD_ID_CATEGORIE = 16; // "Type d'annonce" (picks from WP category)
|
||||
private const FIELD_ID_MEMBRES = 178;
|
||||
private const FIELD_ID_AUTRE_MBRES = 195; // autre_membres (unused in import, for reference)
|
||||
private const FIELD_ID_AXES = 270; // axes_thematiques (picks from axe_thematique)
|
||||
private const FIELD_ID_PROGRAMMES = 271; // programmes_de_recherche (picks from programme_de_recherche)
|
||||
private const FIELD_ID_ETIQUETTES = 652; // étiquettes (picks from post_tag)
|
||||
|
||||
/** Source of the axes applied on the last import(): 'spip' | 'coauthors' | 'owner' | 'none'. */
|
||||
public $last_axes_source = 'none';
|
||||
|
||||
/**
|
||||
* Check if publication is already imported
|
||||
*/
|
||||
public function is_imported($hal_id) {
|
||||
if (empty($hal_id)) return false;
|
||||
global $wpdb;
|
||||
return (int) $wpdb->get_var($wpdb->prepare(
|
||||
"SELECT COUNT(*) FROM {$wpdb->postmeta} WHERE meta_key = 'hal_id' AND meta_value = %s",
|
||||
$hal_id
|
||||
)) > 0;
|
||||
}
|
||||
|
||||
/**
|
||||
* Get category ID for HAL doc type
|
||||
*/
|
||||
public function get_category_id($doc_type) {
|
||||
return self::DOC_TYPE_MAP[$doc_type] ?? null;
|
||||
}
|
||||
|
||||
/**
|
||||
* Get doc type mappings
|
||||
*/
|
||||
public function get_doc_type_map() {
|
||||
return self::DOC_TYPE_MAP;
|
||||
}
|
||||
|
||||
/**
|
||||
* Import a HAL publication as a WordPress post.
|
||||
*
|
||||
* @param array $hal_doc Raw HAL API document.
|
||||
* @param array $wp_users_by_hal_id Map of normalized_hal_id => ['id' => int, 'name' => string].
|
||||
* @param string $post_status Target post_status (default 'pending').
|
||||
* @param bool $backdate_post Use producedDate_s as post_date (default false).
|
||||
* @param array $spip_context SPIP-derived context for bulk imports:
|
||||
* ['axes' => int[], 'tags' => int[], 'programmes' => int[], 'owner_user_id' => ?int]
|
||||
* @return int|WP_Error New post ID on success, WP_Error on failure.
|
||||
* The axes source is stored in $this->last_axes_source for caller reporting.
|
||||
*/
|
||||
public function import(
|
||||
array $hal_doc,
|
||||
array $wp_users_by_hal_id = [],
|
||||
string $post_status = 'pending',
|
||||
bool $backdate_post = false,
|
||||
array $spip_context = []
|
||||
) {
|
||||
$hal_id = $hal_doc['halId_s'] ?? '';
|
||||
$doc_type = $hal_doc['docType_s'] ?? '';
|
||||
|
||||
if (empty($hal_id)) return new WP_Error('no_id', 'Missing HAL ID');
|
||||
if ($this->is_imported($hal_id)) return new WP_Error('exists', 'Already imported: ' . $hal_id);
|
||||
|
||||
// --- Resolve post author from HAL author IDs ---
|
||||
$author_hal_ids = $hal_doc['authIdHal_s'] ?? [];
|
||||
$author_names = $hal_doc['authFullName_s'] ?? [];
|
||||
$matched_user_ids = [];
|
||||
$matched_user_names = [];
|
||||
foreach ($author_hal_ids as $hal_author_id) {
|
||||
$normalized = strtolower(trim($hal_author_id));
|
||||
if (isset($wp_users_by_hal_id[$normalized])) {
|
||||
$user = $wp_users_by_hal_id[$normalized];
|
||||
$matched_user_ids[] = $user['id'];
|
||||
$matched_user_names[] = $user['name'];
|
||||
}
|
||||
}
|
||||
$post_author = !empty($matched_user_ids) ? $matched_user_ids[0] : 1;
|
||||
|
||||
// --- Create the post ---
|
||||
$post_args = [
|
||||
'post_title' => wp_strip_all_tags($hal_doc['title_s'][0] ?? ''),
|
||||
'post_content' => wp_kses_post($hal_doc['abstract_s'][0] ?? ''),
|
||||
'post_status' => $post_status,
|
||||
'post_type' => 'post',
|
||||
'post_author' => $post_author,
|
||||
];
|
||||
|
||||
// Backdate post_date to HAL producedDate_s when requested (for legacy bulk imports)
|
||||
if ($backdate_post) {
|
||||
$backdate_ymd = $this->parse_hal_date($hal_doc['producedDate_s'] ?? '');
|
||||
if ($backdate_ymd) {
|
||||
$post_args['post_date'] = $backdate_ymd . ' 12:00:00';
|
||||
$post_args['post_date_gmt'] = get_gmt_from_date($backdate_ymd . ' 12:00:00'); // convert site-local noon to UTC
|
||||
}
|
||||
}
|
||||
|
||||
$post_id = wp_insert_post($post_args, true);
|
||||
if (is_wp_error($post_id)) return $post_id;
|
||||
|
||||
// --- Category — Pods triple-storage pattern ---
|
||||
$cat_id = self::DOC_TYPE_MAP[$doc_type] ?? null;
|
||||
if ($cat_id) {
|
||||
global $wpdb;
|
||||
|
||||
// 1. Native WP category assignment
|
||||
wp_set_post_categories($post_id, [$cat_id]);
|
||||
|
||||
// 2. Pods postmeta: single integer value
|
||||
update_post_meta($post_id, 'categorie', $cat_id);
|
||||
|
||||
// 3. Pods _pods_ meta: serialized array of one integer
|
||||
update_post_meta($post_id, '_pods_categorie', [$cat_id]);
|
||||
|
||||
// 4. wp_podsrel row
|
||||
$wpdb->insert(
|
||||
$wpdb->prefix . 'podsrel',
|
||||
[
|
||||
'pod_id' => self::POD_ID_POST,
|
||||
'field_id' => self::FIELD_ID_CATEGORIE,
|
||||
'item_id' => $post_id,
|
||||
'related_pod_id' => 0,
|
||||
'related_field_id'=> 0,
|
||||
'related_item_id' => $cat_id,
|
||||
'weight' => 0,
|
||||
],
|
||||
['%d', '%d', '%d', '%d', '%d', '%d', '%d']
|
||||
);
|
||||
}
|
||||
|
||||
// --- Core meta ---
|
||||
update_post_meta($post_id, 'hal_id', $hal_id);
|
||||
update_post_meta($post_id, 'hal_url', $hal_doc['uri_s'] ?? '');
|
||||
|
||||
// HAL PDF file -> lien_externe_1
|
||||
$file_url = $hal_doc['fileMain_s'] ?? '';
|
||||
if ($file_url) {
|
||||
update_post_meta($post_id, 'lien_externe_1', $file_url);
|
||||
update_post_meta($post_id, 'titre_du_lien_externe_1', 'Document HAL // HAL Document');
|
||||
}
|
||||
|
||||
// Journal (ART)
|
||||
$journal = $hal_doc['journalTitle_s'] ?? '';
|
||||
if ($journal) {
|
||||
update_post_meta($post_id, 'journal', $journal);
|
||||
}
|
||||
|
||||
// Book title as sous-titre (COUV), only if different from post title
|
||||
$book_title = $hal_doc['bookTitle_s'] ?? '';
|
||||
$post_title = $hal_doc['title_s'][0] ?? '';
|
||||
if ($book_title && $book_title !== $post_title) {
|
||||
update_post_meta($post_id, 'sous-titre', $book_title);
|
||||
}
|
||||
|
||||
// Publisher -> editeur (plain text, no Pods triple-storage needed)
|
||||
$publisher = $hal_doc['publisher_s'] ?? '';
|
||||
if (is_array($publisher)) $publisher = $publisher[0] ?? '';
|
||||
if ($publisher) {
|
||||
update_post_meta($post_id, 'editeur', $publisher);
|
||||
}
|
||||
|
||||
// Fonction label: bilingual plain text (only relevant for cats 4, 15, 16)
|
||||
if ($doc_type === 'COUV') {
|
||||
update_post_meta($post_id, 'fonction_auteur', 'Auteur du chapitre // Chapter author');
|
||||
} elseif ($doc_type === 'ISSUE') {
|
||||
update_post_meta($post_id, 'fonction_auteur', 'Direction de numéro // Editor-in-Chief');
|
||||
}
|
||||
|
||||
// --- Keywords -> étiquettes (Pods triple-storage, picks from post_tag) ---
|
||||
$hal_keywords = $hal_doc['keyword_s'] ?? [];
|
||||
if (!empty($hal_keywords)) {
|
||||
$matched_term_ids = $this->match_keywords_to_tags($hal_keywords);
|
||||
if (!empty($matched_term_ids)) {
|
||||
global $wpdb;
|
||||
|
||||
// 1. Native WP term relationship
|
||||
wp_set_object_terms($post_id, $matched_term_ids, 'post_tag', true);
|
||||
|
||||
// 2. Individual postmeta rows (one per term ID)
|
||||
foreach ($matched_term_ids as $tid) {
|
||||
add_post_meta($post_id, 'etiquettes', (string) $tid);
|
||||
}
|
||||
|
||||
// 3. _pods_etiquettes: serialized array of term IDs as integers
|
||||
update_post_meta($post_id, '_pods_etiquettes', array_map('intval', $matched_term_ids));
|
||||
|
||||
// 4. wp_podsrel rows
|
||||
foreach ($matched_term_ids as $weight => $tid) {
|
||||
$wpdb->insert(
|
||||
$wpdb->prefix . 'podsrel',
|
||||
[
|
||||
'pod_id' => self::POD_ID_POST,
|
||||
'field_id' => self::FIELD_ID_ETIQUETTES,
|
||||
'item_id' => $post_id,
|
||||
'related_pod_id' => 0,
|
||||
'related_field_id' => 0,
|
||||
'related_item_id' => (int) $tid,
|
||||
'weight' => $weight,
|
||||
],
|
||||
['%d', '%d', '%d', '%d', '%d', '%d', '%d']
|
||||
);
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
// --- Date meta ---
|
||||
$date_raw = $hal_doc['producedDate_s'] ?? '';
|
||||
// THESE/HDR: use defenseDate_s if available, fallback to producedDate_s
|
||||
if (in_array($doc_type, ['THESE', 'HDR'])) {
|
||||
$defense = $hal_doc['defenseDate_s'] ?? '';
|
||||
if ($defense) $date_raw = $defense;
|
||||
}
|
||||
$date_meta = $this->parse_hal_date($date_raw);
|
||||
if ($date_meta) {
|
||||
$date_field = in_array($doc_type, self::EVENT_DOC_TYPES) ? 'date_de_debut' : 'datetime';
|
||||
update_post_meta($post_id, $date_field, $date_meta);
|
||||
}
|
||||
|
||||
// --- Type pick fields (pick custom-simple — no triple-storage) ---
|
||||
$type_picks = [
|
||||
'PROCEEDINGS' => ['type_colloque', 'Colloque'],
|
||||
'THESE' => ['type_soutenance', 'Soutenance de thèse'],
|
||||
'HDR' => ['type_soutenance', "Soutenance d'habilitation"],
|
||||
'SON' => ['type_captation', 'Son'],
|
||||
'VIDEO' => ['type_captation', 'Vidéo'],
|
||||
];
|
||||
if (isset($type_picks[$doc_type])) {
|
||||
[$field, $value] = $type_picks[$doc_type];
|
||||
update_post_meta($post_id, $field, $value);
|
||||
}
|
||||
|
||||
// --- Lieu for PROCEEDINGS (city, country from HAL) ---
|
||||
if ($doc_type === 'PROCEEDINGS') {
|
||||
$city = $hal_doc['city_s'] ?? '';
|
||||
$country = $hal_doc['country_s'] ?? '';
|
||||
if (is_array($city)) $city = $city[0] ?? '';
|
||||
if (is_array($country)) $country = $country[0] ?? '';
|
||||
$lieu = trim("$city, $country", ', ');
|
||||
if ($lieu) {
|
||||
update_post_meta($post_id, 'lieu', $lieu);
|
||||
}
|
||||
}
|
||||
|
||||
// --- Conference title as sous-titre for PROCEEDINGS ---
|
||||
if ($doc_type === 'PROCEEDINGS') {
|
||||
$conf_title = $hal_doc['conferenceTitle_s'] ?? '';
|
||||
if ($conf_title) {
|
||||
update_post_meta($post_id, 'sous-titre', $conf_title);
|
||||
}
|
||||
}
|
||||
|
||||
// --- Reference bibliographique from citationFull_s (cats 4, 15, 16) ---
|
||||
$citation = $hal_doc['citationFull_s'] ?? '';
|
||||
if ($citation && in_array($cat_id, [4, 15, 16])) {
|
||||
update_post_meta($post_id, 'reference_bibliographique', wp_kses_post($citation));
|
||||
}
|
||||
|
||||
// --- Store matched THALIM members (Pods triple-storage pattern) ---
|
||||
if (!empty($matched_user_ids)) {
|
||||
global $wpdb;
|
||||
|
||||
// 1. Individual postmeta rows (one per user ID, as string)
|
||||
foreach ($matched_user_ids as $uid) {
|
||||
add_post_meta($post_id, 'membres', (string) $uid);
|
||||
}
|
||||
|
||||
// 2. _pods_ meta: serialized PHP array of user IDs as integers
|
||||
update_post_meta($post_id, '_pods_membres', array_map('intval', $matched_user_ids));
|
||||
|
||||
// 3. wp_podsrel rows (one per user, weight = position)
|
||||
foreach ($matched_user_ids as $weight => $uid) {
|
||||
$wpdb->insert(
|
||||
$wpdb->prefix . 'podsrel',
|
||||
[
|
||||
'pod_id' => self::POD_ID_POST,
|
||||
'field_id' => self::FIELD_ID_MEMBRES,
|
||||
'item_id' => $post_id,
|
||||
'related_pod_id' => 0,
|
||||
'related_field_id'=> 0,
|
||||
'related_item_id' => (int) $uid,
|
||||
'weight' => $weight,
|
||||
],
|
||||
['%d', '%d', '%d', '%d', '%d', '%d', '%d']
|
||||
);
|
||||
}
|
||||
}
|
||||
|
||||
// --- Axes thématiques: cascade (direct SPIP links > co-authors > owner) ---
|
||||
$axes_resolution = $this->resolve_axes_cascade($matched_user_ids, $spip_context);
|
||||
$this->last_axes_source = $axes_resolution['source'];
|
||||
if (!empty($axes_resolution['term_ids'])) {
|
||||
$this->set_pods_taxonomy_multi(
|
||||
$post_id, 'axes_thematiques', self::FIELD_ID_AXES,
|
||||
$axes_resolution['term_ids'], 'axe_thematique'
|
||||
);
|
||||
}
|
||||
|
||||
// --- Programmes de recherche: direct SPIP links if present, otherwise keyword matching ---
|
||||
$prog_ids = !empty($spip_context['programmes'])
|
||||
? array_map('intval', $spip_context['programmes'])
|
||||
: $this->match_terms_by_keywords($hal_doc['keyword_s'] ?? [], 'programme_de_recherche');
|
||||
if (!empty($prog_ids)) {
|
||||
$this->set_pods_taxonomy_multi(
|
||||
$post_id, 'programmes_de_recherche', self::FIELD_ID_PROGRAMMES,
|
||||
$prog_ids, 'programme_de_recherche'
|
||||
);
|
||||
}
|
||||
|
||||
// --- Direct SPIP tags (in addition to the HAL keyword matching done above) ---
if (!empty($spip_context['tags'])) {
// Merge with the tags already set by the étiquettes block above
$existing = wp_get_object_terms($post_id, 'post_tag', ['fields' => 'ids']);
$existing = is_array($existing) ? array_map('intval', $existing) : [];
$merged = array_values(array_unique(array_merge(
$existing,
array_map('intval', $spip_context['tags'])
)));
// Write only the tags not attached yet, to avoid duplicate postmeta and podsrel rows
$this->set_pods_taxonomy_multi(
$post_id, 'etiquettes', self::FIELD_ID_ETIQUETTES,
array_diff($merged, $existing),
'post_tag'
);
// set_pods_taxonomy_multi() overwrites _pods_etiquettes with the new IDs only; restore the full merged list
update_post_meta($post_id, '_pods_etiquettes', $merged);
|
||||
}
|
||||
|
||||
// Unmatched authors as free text — remove matched names from the full list
|
||||
$unmatched = array_filter($author_names, function($name) use ($matched_user_names) {
|
||||
foreach ($matched_user_names as $matched) {
|
||||
// Loose comparison: ignore case and extra spaces
|
||||
if (mb_strtolower(trim($name)) === mb_strtolower(trim($matched))) {
|
||||
return false;
|
||||
}
|
||||
}
|
||||
return true;
|
||||
});
|
||||
if (!empty($unmatched)) {
|
||||
update_post_meta($post_id, 'autrepersonnes', implode(', ', array_values($unmatched)));
|
||||
}
|
||||
|
||||
// --- Polylang: assign French language ---
|
||||
if (function_exists('pll_set_post_language')) {
|
||||
pll_set_post_language($post_id, 'fr');
|
||||
}
|
||||
|
||||
return $post_id;
|
||||
}
|
||||
|
||||
/**
|
||||
* Match HAL keyword strings against existing WordPress terms in a given taxonomy.
|
||||
*
|
||||
* WP terms are often stored bilingually as "Terme FR // English term".
|
||||
* Matching is case-insensitive against both the FR and EN parts.
|
||||
*
|
||||
* @param string[] $hal_keywords Raw keyword strings from HAL keyword_s field.
|
||||
* @param string $taxonomy WordPress taxonomy (e.g. 'post_tag', 'programme_de_recherche').
|
||||
* @return int[] Matched term IDs.
|
||||
*/
|
||||
private function match_terms_by_keywords(array $hal_keywords, string $taxonomy = 'post_tag'): array {
|
||||
if (empty($hal_keywords)) return [];
|
||||
|
||||
$terms = get_terms(['taxonomy' => $taxonomy, 'hide_empty' => false]);
|
||||
if (is_wp_error($terms) || empty($terms)) return [];
|
||||
|
||||
// Normalise HAL keywords once for comparison
|
||||
$hal_lower = array_map(fn($kw) => mb_strtolower(trim($kw)), $hal_keywords);
|
||||
|
||||
$matched = [];
|
||||
foreach ($terms as $term) {
|
||||
$parts = explode(' // ', $term->name, 2);
|
||||
$fr = mb_strtolower(trim($parts[0]));
|
||||
$en = isset($parts[1]) ? mb_strtolower(trim($parts[1])) : null;
|
||||
|
||||
if (in_array($fr, $hal_lower, true) ||
|
||||
($en !== null && in_array($en, $hal_lower, true))) {
|
||||
$matched[] = (int) $term->term_id;
|
||||
}
|
||||
}
|
||||
|
||||
return $matched;
|
||||
}
|
||||
|
||||
/**
|
||||
* Backwards-compatible alias for the renamed method.
|
||||
*/
|
||||
private function match_keywords_to_tags(array $hal_keywords): array {
|
||||
return $this->match_terms_by_keywords($hal_keywords, 'post_tag');
|
||||
}
|
||||
|
||||
/**
|
||||
* Resolve axes thématiques through a cascade of strategies.
|
||||
*
|
||||
* 1. Direct SPIP links ($spip_context['axes'])
|
||||
* 2. Axes from all matched WP co-authors (_pods_axes_thematiques)
|
||||
* 3. Axe of the SPIP flux owner user ($spip_context['owner_user_id'])
|
||||
*
|
||||
* @return array{source: string, term_ids: int[]}
|
||||
*/
|
||||
private function resolve_axes_cascade(array $matched_user_ids, array $spip_context): array {
|
||||
// 1. SPIP direct
|
||||
if (!empty($spip_context['axes'])) {
|
||||
$ids = array_values(array_unique(array_map('intval', $spip_context['axes'])));
|
||||
if (!empty($ids)) return ['source' => 'spip', 'term_ids' => $ids];
|
||||
}
|
||||
|
||||
// 2. Co-authors matched (any matched THALIM member with an axe)
|
||||
$from_authors = [];
|
||||
foreach ($matched_user_ids as $uid) {
|
||||
$axes = get_user_meta((int) $uid, '_pods_axes_thematiques', true);
|
||||
if (is_array($axes)) {
|
||||
foreach ($axes as $tid) $from_authors[] = (int) $tid;
|
||||
} elseif (is_string($axes) && $axes !== '') {
|
||||
$unser = @unserialize($axes);
|
||||
if (is_array($unser)) {
|
||||
foreach ($unser as $tid) $from_authors[] = (int) $tid;
|
||||
}
|
||||
}
|
||||
}
|
||||
$from_authors = array_values(array_unique(array_filter($from_authors)));
|
||||
if (!empty($from_authors)) {
|
||||
return ['source' => 'coauthors', 'term_ids' => $from_authors];
|
||||
}
|
||||
|
||||
// 3. SPIP flux owner user
|
||||
if (!empty($spip_context['owner_user_id'])) {
|
||||
$axes = get_user_meta((int) $spip_context['owner_user_id'], '_pods_axes_thematiques', true);
|
||||
$ids = [];
|
||||
if (is_array($axes)) {
|
||||
$ids = array_map('intval', $axes);
|
||||
} elseif (is_string($axes) && $axes !== '') {
|
||||
$unser = @unserialize($axes);
|
||||
if (is_array($unser)) $ids = array_map('intval', $unser);
|
||||
}
|
||||
$ids = array_values(array_unique(array_filter($ids)));
|
||||
if (!empty($ids)) return ['source' => 'owner', 'term_ids' => $ids];
|
||||
}
|
||||
|
||||
return ['source' => 'none', 'term_ids' => []];
|
||||
}
|
||||
|
||||
/**
|
||||
* Parse a HAL date (YYYY, YYYY-MM, YYYY-MM-DD, or ISO datetime) to Y-m-d.
|
||||
* Returns '' on failure. HAL often emits partial dates that strtotime
|
||||
* mishandles (e.g. strtotime("2022") interprets 2022 as a time, not a year).
|
||||
*/
|
||||
private function parse_hal_date(string $raw): string {
|
||||
$raw = trim($raw);
|
||||
if ($raw === '') return '';
|
||||
if (preg_match('/^(\d{4})-(\d{2})-(\d{2})/', $raw, $m)) {
|
||||
return "{$m[1]}-{$m[2]}-{$m[3]}";
|
||||
}
|
||||
if (preg_match('/^(\d{4})-(\d{2})$/', $raw, $m)) {
|
||||
return "{$m[1]}-{$m[2]}-01";
|
||||
}
|
||||
if (preg_match('/^(\d{4})$/', $raw, $m)) {
|
||||
return "{$m[1]}-01-01";
|
||||
}
|
||||
$ts = strtotime($raw);
|
||||
return $ts ? date('Y-m-d', $ts) : '';
|
||||
}
|
||||
|
||||
/**
|
||||
* Generic Pods triple-storage writer for multi-value taxonomy fields.
|
||||
* Writes to: wp_term_relationships, postmeta rows, _pods_ meta, wp_podsrel.
|
||||
*/
|
||||
private function set_pods_taxonomy_multi(int $post_id, string $field_name, int $field_id, array $term_ids, string $taxonomy): void {
|
||||
if (empty($term_ids)) return;
|
||||
global $wpdb;
|
||||
$term_ids = array_values(array_unique(array_map('intval', $term_ids)));
|
||||
|
||||
// 1. wp_term_relationships
|
||||
wp_set_object_terms($post_id, $term_ids, $taxonomy, true);
|
||||
|
||||
// 2. postmeta (one row per term ID, as string)
|
||||
foreach ($term_ids as $tid) {
|
||||
add_post_meta($post_id, $field_name, (string) $tid);
|
||||
}
|
||||
|
||||
// 3. _pods_ meta: serialized array of ints
|
||||
update_post_meta($post_id, '_pods_' . $field_name, $term_ids);
|
||||
|
||||
// 4. wp_podsrel rows (weight = position)
|
||||
foreach ($term_ids as $weight => $tid) {
|
||||
$wpdb->insert(
|
||||
$wpdb->prefix . 'podsrel',
|
||||
[
|
||||
'pod_id' => self::POD_ID_POST,
|
||||
'field_id' => $field_id,
|
||||
'item_id' => $post_id,
|
||||
'related_pod_id' => 0,
|
||||
'related_field_id' => 0,
|
||||
'related_item_id' => (int) $tid,
|
||||
'weight' => $weight,
|
||||
],
|
||||
['%d', '%d', '%d', '%d', '%d', '%d', '%d']
|
||||
);
|
||||
}
|
||||
}
|
||||
}
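The bilingual matching rule used by match_terms_by_keywords() ("Terme FR // English term", compared case-insensitively on either half) can be tried outside WordPress with a plain array of term names. A minimal sketch: `example_match_bilingual` and the sample terms are hypothetical, and the `term_id => name` array stands in for the objects returned by get_terms().

```php
<?php
/**
 * Hypothetical stand-alone version of the matching rule.
 *
 * @param string[] $keywords Raw keyword strings.
 * @param array    $terms    Map of term_id => term name, optionally "FR // EN".
 * @return int[] IDs of the terms whose FR or EN half equals a keyword.
 */
function example_match_bilingual(array $keywords, array $terms): array {
    // Normalise the keywords once.
    $wanted = array_map(fn($kw) => mb_strtolower(trim((string) $kw)), $keywords);

    $matched = [];
    foreach ($terms as $term_id => $name) {
        // At most two parts: the FR label and, if present, the EN label.
        foreach (explode(' // ', (string) $name, 2) as $part) {
            if (in_array(mb_strtolower(trim($part)), $wanted, true)) {
                $matched[] = (int) $term_id;
                break; // one hit per term is enough
            }
        }
    }
    return $matched;
}

$terms = [7 => 'Théâtre // Theatre', 9 => 'Poésie', 12 => 'Cinéma // Film'];
$ids   = example_match_bilingual(['THEATRE', ' poésie '], $terms);
```

Matching is exact after normalisation, so a keyword such as "film studies" does not match the term "Cinéma // Film".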
|
||||
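The partial-date handling described for parse_hal_date() can be illustrated with a simplified variant. This sketch is not the plugin's method: it folds the three patterns into one regular expression, drops the strtotime() fallback, and adds a checkdate() validation that the original does not perform. The name `example_parse_hal_date` is hypothetical.

```php
<?php
/**
 * Hypothetical variant: normalise YYYY, YYYY-MM, YYYY-MM-DD or an ISO
 * datetime to Y-m-d, padding missing parts with 01. Returns '' when the
 * input does not start with a recognisable date or the date is impossible.
 */
function example_parse_hal_date(string $raw): string {
    $raw = trim($raw);
    // Year, then optional month, then optional day; must not be followed by a digit.
    if (!preg_match('/^(\d{4})(?:-(\d{2})(?:-(\d{2}))?)?(?!\d)/', $raw, $m)) {
        return '';
    }
    $year  = (int) $m[1];
    $month = (int) ($m[2] ?? 1); // missing month defaults to January
    $day   = (int) ($m[3] ?? 1); // missing day defaults to the 1st
    if (!checkdate($month, $day, $year)) {
        return ''; // e.g. 2022-02-31
    }
    return sprintf('%04d-%02d-%02d', $year, $month, $day);
}

$samples = ['2022', '2022-03', '2022-03-15T10:30:00Z', '2022-02-31'];
$parsed  = array_map('example_parse_hal_date', $samples);
```

Handling the year-only case explicitly matters because strtotime("2022") reads the digits as a time of day rather than a year, as noted in the docblock above.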
86
thalim-hal-importer.php
Normal file
@@ -0,0 +1,86 @@
|
||||
<?php
|
||||
/**
|
||||
* Plugin Name: THALIM HAL Importer
|
||||
* Plugin URI: https://thalim.fr
|
||||
* Description: Import publications from HAL for the THALIM lab (structId_i:254015)
|
||||
* Version: 2.0.0
|
||||
* Author: THALIM Dev
|
||||
* Author URI: https://thalim.fr
|
||||
* License: GPL v2 or later
|
||||
* Text Domain: thalim-hal-importer
|
||||
*/
|
||||
|
||||
// Prevent direct access
|
||||
if (!defined('ABSPATH')) {
|
||||
exit;
|
||||
}
|
||||
|
||||
// Plugin constants
|
||||
define('THALIM_HAL_VERSION', '2.0.0');
|
||||
define('THALIM_HAL_PLUGIN_DIR', plugin_dir_path(__FILE__));
|
||||
define('THALIM_HAL_PLUGIN_URL', plugin_dir_url(__FILE__));
|
||||
|
||||
// HAL API configuration
|
||||
define('THALIM_HAL_API_BASE', 'https://api.archives-ouvertes.fr/search/');
|
||||
define('THALIM_HAL_STRUCT_ID', 254015); // THALIM lab structure ID
|
||||
define('THALIM_HAL_DOC_TYPES', ['ART', 'COUV', 'OUV', 'COMM', 'ISSUE', 'PROCEEDINGS', 'THESE', 'HDR', 'SON', 'VIDEO']);
|
||||
|
||||
/**
|
||||
* Main plugin class - minimal bootstrap
|
||||
*/
|
||||
class Thalim_HAL_Importer {
|
||||
|
||||
private static $instance = null;
|
||||
|
||||
public static function get_instance() {
|
||||
if (self::$instance === null) {
|
||||
self::$instance = new self();
|
||||
}
|
||||
return self::$instance;
|
||||
}
|
||||
|
||||
private function __construct() {
|
||||
$this->load_dependencies();
|
||||
$this->init_hooks();
|
||||
}
|
||||
|
||||
private function load_dependencies() {
|
||||
require_once THALIM_HAL_PLUGIN_DIR . 'includes/class-hal-api.php';
|
||||
require_once THALIM_HAL_PLUGIN_DIR . 'includes/class-admin-page.php';
|
||||
require_once THALIM_HAL_PLUGIN_DIR . 'includes/class-importer.php';
|
||||
}
|
||||
|
||||
private function init_hooks() {
|
||||
add_action('admin_menu', [$this, 'add_admin_menu']);
|
||||
}
|
||||
|
||||
public function add_admin_menu() {
|
||||
add_management_page(
|
||||
'HAL Import',
|
||||
'HAL Import',
|
||||
'edit_others_posts',
|
||||
'thalim-hal-importer',
|
||||
[$this, 'render_admin_page']
|
||||
);
|
||||
}
|
||||
|
||||
public function render_admin_page() {
|
||||
$admin_page = new Thalim_HAL_Admin_Page();
|
||||
$admin_page->render();
|
||||
}
|
||||
}
|
||||
|
||||
// Activation hook
|
||||
register_activation_hook(__FILE__, function() {
|
||||
add_option('thalim_hal_version', THALIM_HAL_VERSION);
|
||||
});
|
||||
|
||||
// Deactivation hook
|
||||
register_deactivation_hook(__FILE__, function() {
|
||||
delete_transient('thalim_hal_preview_data');
|
||||
});
|
||||
|
||||
// Initialize plugin
|
||||
add_action('plugins_loaded', function() {
|
||||
Thalim_HAL_Importer::get_instance();
|
||||
});
|
||||