
A few days ago we saw how to split long WordPress posts into pages without affecting SEO. Today we will add an extra function that makes it easy to find and categorise those long posts by content type, length and language.
When you have only a few long entries it is easy to know them almost by heart. It gets more complicated when there are a lot of them and they also have translations in other languages. This code aims to speed up the selection process when splitting posts (with the Gutenberg "Page Break" block) and to show you quickly which ones you still have to split and whether your current splits are coherent.
What exactly does the code do?
Added to the functions.php of your theme or child theme, the following code registers the shortcode [analizador_pagebreaks_ultimate]. Used on any page or post, this shortcode displays a paginated list (10 entries per page) of all posts with 1500 words or more, including those translated into other languages. It works both with and without Polylang.
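A note on the word threshold: the snippet counts words with PHP's str_word_count(), which is not multibyte-aware, so accented words may be split and inflate the count in languages such as Spanish. The following is only a sketch of an alternative, assuming UTF-8 content; pb_count_words_unicode() is a hypothetical helper and is not part of the original code.

```php
<?php
// Hypothetical helper (not part of the original snippet): counts words by
// splitting the tag-free text on Unicode whitespace.
function pb_count_words_unicode(string $html): int
{
    $text = trim(strip_tags($html));
    if ($text === '') {
        return 0;
    }
    $words = preg_split('/\s+/u', $text, -1, PREG_SPLIT_NO_EMPTY);
    // preg_split() returns false on invalid UTF-8; treat that as zero words.
    return $words === false ? 0 : count($words);
}

echo pb_count_words_unicode('<p>Programación avanzada en español</p>'), "\n"; // prints 4
```

If you adopt something like this, use it everywhere words are counted (totals and per-section counts), otherwise the figures will not add up.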
These values can be changed in the section marked // 2. Configuration. Just don't raise the number of entries shown per page too much if you don't want the query to slow down the response. The idea is to use the shortcode on draft or private pages, so that a spike of visitors cannot affect performance and eat up your server's CPU if your resources are limited.
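To make the effect of the "entries per page" value concrete, here is a minimal sketch of manual pagination over an array that is already sorted. pb_paginate() is a hypothetical name, and the clamping of out-of-range page numbers is an addition that the original snippet does not perform.

```php
<?php
// Sketch of manual pagination; pb_paginate() is a hypothetical helper.
function pb_paginate(array $items, int $page, int $per_page = 10): array
{
    $per_page    = max(1, $per_page);
    $total_pages = max(1, (int) ceil(count($items) / $per_page));
    $page        = min(max(1, $page), $total_pages); // clamp out-of-range pages
    $offset      = ($page - 1) * $per_page;

    return [
        'items'       => array_slice($items, $offset, $per_page),
        'page'        => $page,
        'total_pages' => $total_pages,
    ];
}

$result = pb_paginate(range(1, 23), 3);
print_r($result['items']); // the last page holds items 21, 22 and 23
```

Only the slice for the current page is rendered, but every long post is still analysed on each request, which is why a private or draft page is the safer place for the shortcode.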
This code will allow you to see not only how many breaks a post has (or if it has none at all), but also how the words are distributed between the different sections by showing how many words each break contains, which can be very useful to evaluate if your current pagination is balanced.
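The idea behind the per-section count can be sketched in a few lines: split the content at each page-break marker and count the words in every piece. pb_words_per_section() is a hypothetical name used only for this illustration.

```php
<?php
// Hypothetical helper illustrating the word distribution between page breaks.
function pb_words_per_section(string $content): array
{
    $sections = explode('<!--nextpage-->', $content);
    if (count($sections) < 2) {
        return []; // no page breaks, nothing to distribute
    }
    return array_map(
        static fn(string $section): int => str_word_count(strip_tags($section)),
        $sections
    );
}

$content = '<p>one two three</p><!--nextpage--><p>four five</p><!--nextpage--><p>six</p>';
print_r(pb_words_per_section($content)); // three sections with 3, 2 and 1 words
```

Two breaks always produce three sections, so a post reported with "2 breaks" shows three word counts.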
Automatic detection of content types
Content is automatically classified into 4 types, each with its own target of one break every x words. The targets are based on information density, and the maximum number of suggested breaks is configurable.
Technical (700 words/break): Tutorials, code
Narrative (1200 words/break): Stories, narratives
Journalistic (900 words/break): News, articles
General (1000 words/break): Standard content
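As a worked example of how these targets turn into a suggestion, the sketch below divides the word count by the target for the type, subtracts one, and keeps the result between 1 and the maximum. The English type keys and the name pb_suggested_breaks() are assumptions made for this illustration.

```php
<?php
// Words-per-break targets from the list above. The English keys and the
// function name are hypothetical.
const PB_WORDS_PER_BREAK = [
    'technical'    => 700,
    'narrative'    => 1200,
    'journalistic' => 900,
    'general'      => 1000,
];

function pb_suggested_breaks(int $words, string $type, int $max_breaks = 5): int
{
    $target = PB_WORDS_PER_BREAK[$type] ?? PB_WORDS_PER_BREAK['general'];
    $breaks = max(1, intdiv($words, $target) - 1); // always suggest at least one
    return min($breaks, $max_breaks);
}

$breaks = pb_suggested_breaks(3200, 'technical');  // intdiv(3200, 700) - 1 = 3
$words_per_section = (int) ceil(3200 / ($breaks + 1)); // 3 breaks give 4 sections
echo $breaks, ' breaks, about ', $words_per_section, " words per section\n";
// prints "3 breaks, about 800 words per section"
```

Because the suggestion is capped, very long posts end up with sections longer than the nominal target for their type.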
Selection criteria
To define the criteria I looked for tips and references and adjusted some parameters to my own blog. None of this is fixed: it can be fine-tuned for each site, and you can always add more signals to make the detection more accurate. Take it as a starting point, look at which elements you use most often in each type of article, and add your own markers and most-used tags so the code can identify them.
The current detection criteria are as follows. Note that the code matches the Spanish slugs used on my blog (guia, relato, noticia and so on), so adapt them to your own categories and tags:
A. Technical Content (⚙️ 700 words/break)
- Categories: tutorial, guide, technical, coding, programming
- Tags: code, software, technology, dev
- Structure: more than two code markers (Markdown code fences, <pre> or <code> tags)
B. Narrative content (📖 1200 words/break)
- Categories: short story, history, narrative, literature
- Tags: short story, novel, fiction, poetry
- Structure: more than three lines that start with a dialogue dash, a quotation mark or a numbered item
C. Journalistic content (📰 900 words/break)
- Categories: news, current affairs, report, article
- Tags: press, interview, opinion
- Structure: more than three Markdown-style subheadings (### or ####) or notes marked *Nota:
D. General Content (✍️ 1000 words/break)
- Applies when none of the above criteria match.
- Default value for standard posts
The code evaluates the items in the following order:
- Post categories
- Post tags
- Structure of the content
- Assign "General" if there is no match
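The evaluation order above can be sketched as a simple cascade in which the first match wins. This is a simplified illustration, not the snippet itself: the English slug lists and the function names are assumptions, it takes plain arrays instead of querying WordPress, and it leaves out the journalistic structure check.

```php
<?php
// Hypothetical slug lists in English; the original snippet uses Spanish slugs.
const PB_CATEGORY_TYPES = [
    'technical'    => ['tutorial', 'guide', 'technical', 'coding', 'programming'],
    'narrative'    => ['short-story', 'history', 'narrative', 'literature'],
    'journalistic' => ['news', 'current-affairs', 'report', 'article'],
];
const PB_TAG_TYPES = [
    'technical'    => ['code', 'software', 'technology', 'dev'],
    'narrative'    => ['short-story', 'novel', 'fiction', 'poetry'],
    'journalistic' => ['press', 'interview', 'opinion'],
];

// Returns the first type whose slug list shares an entry with $slugs.
function pb_match_slugs(array $slugs, array $map): ?string
{
    foreach ($map as $type => $known) {
        if (array_intersect($slugs, $known)) {
            return $type;
        }
    }
    return null;
}

function pb_detect_type(array $categories, array $tags, string $raw_content): string
{
    // 1. Categories, then 2. tags.
    $type = pb_match_slugs($categories, PB_CATEGORY_TYPES)
        ?? pb_match_slugs($tags, PB_TAG_TYPES);
    if ($type !== null) {
        return $type;
    }
    // 3. Structure: more than two code markers suggests technical content.
    if (preg_match_all('/```|<pre\b|<code\b/i', $raw_content) > 2) {
        return 'technical';
    }
    // More than three lines opening with a dialogue dash (U+2014) or a quote.
    if (preg_match_all('/^(?:\x{2014}|")/mu', strip_tags($raw_content)) > 3) {
        return 'narrative';
    }
    // 4. Fallback when nothing matches.
    return 'general';
}

echo pb_detect_type(['news'], ['code'], ''), "\n"; // prints "journalistic"
```

In the example the category decides the type even though the tag points to technical content, because categories are checked before tags.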
The maximum number of breaks suggested is 5 per post.
Appearance
The CSS is included in the code for convenience and displays each post as a card. The information shown is self-explanatory.
You can modify it to link the title to the edit page if you find it more convenient for quick access, or make any other improvements you can think of.
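As a rough idea of that change, the sketch below builds the title link and prefers an edit URL when one is available. pb_title_link() is a hypothetical helper written in plain PHP; inside WordPress you would obtain the edit URL with get_edit_post_link($post_id), which returns null when the current user cannot edit the post, and you would escape with esc_url() and esc_html() instead of htmlspecialchars().

```php
<?php
// Hypothetical helper: links the title to the edit screen when an edit URL
// is available, and to the public permalink otherwise.
function pb_title_link(string $title, string $view_url, ?string $edit_url = null): string
{
    $href = ($edit_url !== null && $edit_url !== '') ? $edit_url : $view_url;

    return '<a href="' . htmlspecialchars($href, ENT_QUOTES, 'UTF-8') . '">'
        . htmlspecialchars($title, ENT_QUOTES, 'UTF-8') . '</a>';
}

// Example URLs are placeholders.
echo pb_title_link(
    'Long post',
    'https://example.com/long-post/',
    'https://example.com/wp-admin/post.php?post=42&action=edit'
), "\n";
```

Falling back to the permalink keeps the list usable for visitors who are not logged in or lack editing rights.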
Code
// ======================================================================
// Displays a paginated list of long posts (ordered from most to fewest words)
// with suggestions for splitting them into pages according to content type.
// Shortcode for Page Breaks analysis on long posts. Courtesy of /jrmora.com
// ======================================================================
function analizador_pagebreaks_completo() {
    // 1. Guard against execution in admin, AJAX and REST requests
    if (is_admin() || wp_doing_ajax() || (defined('REST_REQUEST') && REST_REQUEST)) {
        return '';
    }

    // 2. Configuration
    $config = [
        'umbral_palabras'       => 1500, // minimum word count for a post to be listed
        'posts_por_pagina'      => 10,   // entries shown per page of results
        'mostrar_todos_idiomas' => true  // include Polylang translations
    ];

    // 3. Content type detection
    // Declared only once, so using the shortcode twice on the same page does
    // not trigger a "cannot redeclare function" error.
    if (!function_exists('detectar_tipo_contenido')) {
        function detectar_tipo_contenido($post_id) {
            $tecnico      = ['tipo' => 'técnico', 'color' => '#e3f2fd', 'icono' => '⚙️'];
            $narrativo    = ['tipo' => 'narrativo', 'color' => '#f3e5f5', 'icono' => '📖'];
            $periodistico = ['tipo' => 'periodístico', 'color' => '#e8f5e9', 'icono' => '📰'];
            $general      = ['tipo' => 'general', 'color' => '#f5f5f5', 'icono' => '✍️'];

            $post = get_post($post_id);
            if (!$post) {
                return $general;
            }
            $raw     = $post->post_content; // keeps the HTML tags for the code check
            $content = strip_tags($raw);    // plain text for the line-based checks

            // By category
            $categorias = wp_get_post_categories($post_id, ['fields' => 'slugs']);
            if (array_intersect($categorias, ['tutorial', 'guia', 'tecnico', 'code', 'programacion'])) {
                return $tecnico;
            } elseif (array_intersect($categorias, ['relato', 'historia', 'narrativa', 'literatura'])) {
                return $narrativo;
            } elseif (array_intersect($categorias, ['noticia', 'actualidad', 'reportaje', 'articulo'])) {
                return $periodistico;
            }

            // By tag
            $etiquetas = wp_get_post_tags($post_id, ['fields' => 'slugs']);
            if (array_intersect($etiquetas, ['codigo', 'software', 'tecnologia', 'dev'])) {
                return $tecnico;
            } elseif (array_intersect($etiquetas, ['cuento', 'novela', 'ficcion', 'poesia'])) {
                return $narrativo;
            } elseif (array_intersect($etiquetas, ['prensa', 'entrevista', 'opinion'])) {
                // Journalistic tags from the criteria list; these slugs are assumed
                return $periodistico;
            }

            // By structure (the code check runs on the raw HTML, because
            // strip_tags() would remove the <pre> and <code> tags first)
            if (preg_match_all('/```|<pre\b|<code\b/i', $raw) > 2) {
                return $tecnico;
            } elseif (preg_match_all('/\n—|\n"|\n\d+\. /i', $content) > 3) {
                return $narrativo;
            } elseif (preg_match_all('/\n### |\n#### |\*Nota:/i', $content) > 3) {
                return $periodistico;
            }
            return $general;
        }
    }
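
    // 4. Suggested breaks per content type
    if (!function_exists('calcular_breaks_por_tipo')) {
        function calcular_breaks_por_tipo($palabras, $tipo) {
            // Target number of words per break for each type
            $palabras_por_break = [
                'técnico'      => 700,
                'narrativo'    => 1200,
                'periodístico' => 900,
                'general'      => 1000
            ];
            $objetivo = $palabras_por_break[$tipo] ?? 1000;
            $breaks   = max(1, (int) floor($palabras / $objetivo) - 1);
            return min($breaks, 5); // At most 5 breaks
        }
    }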

    // 5. Word distribution between breaks
    if (!function_exists('calcular_distribucion_palabras')) {
        function calcular_distribucion_palabras($content) {
            if (strpos($content, '<!--nextpage-->') === false) {
                return [];
            }
            // Split the content at each break and count the words per section
            $distribucion = [];
            foreach (explode('<!--nextpage-->', $content) as $seccion) {
                $distribucion[] = str_word_count(strip_tags($seccion));
            }
            return $distribucion;
        }
    }

    // 6. Collect the IDs of all posts to process
    if (!function_exists('obtener_ids_posts_a_procesar')) {
        function obtener_ids_posts_a_procesar($mostrar_todos_idiomas) {
            $query = new WP_Query([
                'post_type'      => 'post',
                'posts_per_page' => -1,
                'fields'         => 'ids',
                'post_status'    => 'publish'
            ]);

            // Normal case (no Polylang, or translations not wanted)
            if (!$mostrar_todos_idiomas || !function_exists('pll_get_post_translations')) {
                return $query->posts;
            }

            // Polylang is active: add every translation of each post.
            // The post itself is kept in case it has no language assigned.
            $all_post_ids = [];
            foreach ($query->posts as $post_id) {
                $translations = pll_get_post_translations($post_id);
                $all_post_ids = array_merge($all_post_ids, [$post_id], array_values($translations));
            }
            return array_unique($all_post_ids);
        }
    }

    // 7. Fetch and process the posts
    $posts_procesados = [];
    $post_ids = obtener_ids_posts_a_procesar($config['mostrar_todos_idiomas']);
    foreach ($post_ids as $post_id) {
        procesar_post_con_tipo($post_id, $config['umbral_palabras'], $posts_procesados);
    }

    // 8. Sort by word count (highest first)
    usort($posts_procesados, function ($a, $b) {
        return $b['palabras'] <=> $a['palabras'];
    });

    // 9. Manual pagination ('page' is the query var used on a static front page)
    $paged           = max(1, (int) get_query_var('paged'), (int) get_query_var('page'));
    $total_posts     = count($posts_procesados);
    $total_paginas   = (int) ceil($total_posts / $config['posts_por_pagina']);
    $offset          = ($paged - 1) * $config['posts_por_pagina'];
    $posts_paginados = array_slice($posts_procesados, $offset, $config['posts_por_pagina']);

    // 10. CSS with visual improvements
    $output = '<style>
        .pb-ultimate-container {
            max-width: 800px;
            margin: 0 auto;
            font-family: -apple-system, BlinkMacSystemFont, "Segoe UI", Roboto, sans-serif;
        }
        .pb-ultimate-item {
            background: #fff;
            border-radius: 8px;
            padding: 20px;
            margin-bottom: 20px;
            box-shadow: 0 1px 3px rgba(0,0,0,0.08);
            border: 1px solid #e0e0e0;
        }
        .pb-ultimate-title {
            font-size: 1.25rem;
            margin: 0 0 10px 0;
            line-height: 1.4;
        }
        .pb-ultimate-meta {
            display: flex;
            flex-wrap: wrap;
            gap: 12px;
            margin-bottom: 12px;
            font-size: 0.85rem;
        }
        .pb-ultimate-meta-item {
            display: inline-flex;
            align-items: center;
            gap: 4px;
        }
        .pb-ultimate-type {
            padding: 2px 10px;
            border-radius: 10px;
            font-size: 0.75rem;
            font-weight: 500;
        }
        .pb-ultimate-suggestion {
            background: #f5f5f5;
            border-left: 3px solid #64b5f6;
            padding: 10px 15px;
            margin-top: 15px;
            border-radius: 0 4px 4px 0;
            font-size: 0.9rem;
        }
        .pb-ultimate-pagination {
            display: flex;
            justify-content: center;
            margin: 30px 0;
            flex-wrap: wrap;
            gap: 8px;
        }
        .pb-ultimate-page {
            padding: 8px 16px;
            border-radius: 4px;
            text-decoration: none;
        }
        .pb-ultimate-page-number {
            border: 1px solid #e0e0e0;
            color: #1976d2;
        }
        .pb-ultimate-page-number:hover {
            background: #e3f2fd;
        }
        .pb-ultimate-page-current {
            background: #1976d2;
            color: white;
            border: 1px solid #1976d2;
        }
        .pb-ultimate-wordcount {
            font-weight: 600;
            color: #212121;
        }
        .pb-ultimate-language {
            background: #e3f2fd;
            color: #1565c0;
            padding: 2px 10px;
            border-radius: 10px;
            font-size: 0.75rem;
        }
        .pb-ultimate-warning {
            color: #d32f2f;
            font-weight: 500;
        }
        .pb-ultimate-distribution {
            margin-left: 5px;
            font-size: 0.95em; /* Increased from 0.8em to 0.95em */
            font-weight: 600; /* Bold added */
            color: #444; /* Darker colour for better contrast */
            background: #f8f8f8; /* Subtle background */
            padding: 2px 6px;
            border-radius: 4px;
            border-left: 2px solid #64b5f6; /* Blue left border */
        }
        .pb-ultimate-distribution-separator {
            color: #999;
            font-weight: normal;
            margin: 0 3px;
        }
    </style>';

    // 11. Render one card per post
    $output .= '<div class="pb-ultimate-container">';
    if (!empty($posts_procesados)) {
        foreach ($posts_paginados as $post) {
            $breaks_sugeridos     = calcular_breaks_por_tipo($post['palabras'], $post['tipo']['tipo']);
            $palabras_por_seccion = ceil($post['palabras'] / ($breaks_sugeridos + 1));

            $output .= '<div class="pb-ultimate-item">';
            $output .= '<h3 class="pb-ultimate-title"><a href="' . esc_url($post['enlace']) . '">' . esc_html($post['titulo']) . '</a></h3>';
            $output .= '<div class="pb-ultimate-meta">';
            $output .= '<span class="pb-ultimate-meta-item pb-ultimate-wordcount">📝 ' . number_format($post['palabras']) . ' words</span>';
            $output .= '<span class="pb-ultimate-meta-item pb-ultimate-type" style="background: ' . esc_attr($post['tipo']['color']) . '">';
            $output .= $post['tipo']['icono'] . ' ' . esc_html(ucfirst($post['tipo']['tipo']));
            $output .= '</span>';
            $output .= '<span class="pb-ultimate-meta-item pb-ultimate-language">' . esc_html(strtoupper($post['idioma'])) . '</span>';

            if ($post['tiene_break']) {
                // Words in each section, shown as "(1: 800 · 2: 750)"
                $distribucion      = calcular_distribucion_palabras($post['contenido']);
                $distribucion_text = '';
                if (!empty($distribucion)) {
                    $distribucion_parts = [];
                    foreach ($distribucion as $index => $palabras) {
                        $distribucion_parts[] = ($index + 1) . ': ' . number_format($palabras);
                    }
                    $distribucion_text = '<span class="pb-ultimate-distribution">('
                        . implode('<span class="pb-ultimate-distribution-separator"> · </span>', $distribucion_parts)
                        . ')</span>';
                }
                $output .= '<span class="pb-ultimate-meta-item">✅ ' . (int) $post['breaks_actuales'] . ' breaks ';
                $output .= $distribucion_text;
                $output .= '</span>';
            } else {
                $output .= '<span class="pb-ultimate-meta-item pb-ultimate-warning">⚠️ NOT PAGINATED</span>';
            }
            $output .= '</div>';

            $output .= '<div class="pb-ultimate-suggestion">';
            $output .= '<strong>Suggestion:</strong> ' . $breaks_sugeridos . ' breaks (1 every ~' . number_format($palabras_por_seccion) . ' words)';
            $output .= '</div>';
            $output .= '</div>';
        }

        // Pagination links
        if ($total_paginas > 1) {
            $output .= '<div class="pb-ultimate-pagination">';
            $output .= paginate_links([
                'base'      => str_replace(999999999, '%#%', esc_url(get_pagenum_link(999999999))),
                'format'    => '?paged=%#%',
                'current'   => $paged,
                'total'     => $total_paginas,
                'prev_text' => '«',
                'next_text' => '»',
                'mid_size'  => 1
            ]);
            $output .= '</div>';
        }
    } else {
        $output .= '<div class="pb-ultimate-item">No posts requiring pagination were found.</div>';
    }
    return $output . '</div>';
}

// Helper: analyse one post and add it to the list if it is long enough
function procesar_post_con_tipo($post_id, $umbral, &$posts) {
    $post = get_post($post_id);
    if (!$post) {
        return;
    }
    $content    = $post->post_content;
    $word_count = str_word_count(strip_tags($content));
    if ($word_count < $umbral) {
        return;
    }

    $tipo   = detectar_tipo_contenido($post_id);
    $idioma = 'es'; // Default value; change it to your site's main language

    // Detect the language if Polylang is active
    if (function_exists('pll_get_post_language')) {
        $idioma = pll_get_post_language($post_id) ?: 'es';
    }

    $breaks_actuales = substr_count($content, '<!--nextpage-->');
    $posts[] = [
        'ID'              => $post_id,
        'titulo'          => $post->post_title,
        'palabras'        => $word_count,
        'contenido'       => $content,
        'enlace'          => get_permalink($post_id),
        'tiene_break'     => $breaks_actuales > 0,
        'breaks_actuales' => $breaks_actuales,
        'idioma'          => $idioma,
        'tipo'            => $tipo
    ];
}
add_shortcode('analizador_pagebreaks_ultimate', 'analizador_pagebreaks_completo');