Break a string without losing html tags in PHP


<?php
/**
* Truncates (substr-style) a string without breaking or losing HTML tags.
*
* @param string $text
*    The string to shorten.
*
* @param int $length
*    Maximum length of the returned text, counting $ending but not HTML tags.
*
* @param string $ending
*    The string appended after shortening. Defaults to '...'.
*
* @param boolean $exact
*   If false, $text will not be cut mid-word.
*
* @param boolean $considerHtml
*   If true, HTML tags are kept intact, left uncounted, and closed properly.
*
* @return string
*     The trimmed string.
*/
function _html_substr($text, $length = 100, $ending = '...', $exact = false, $considerHtml = true) {
    if ($considerHtml) {
      // If the plain text is shorter than the maximum length, return the whole text
      if (strlen(preg_replace('/<.*?>/', '', $text)) <= $length) {
        return $text;
      }
      // split the text into scannable chunks: an optional html-tag followed by plain text
      preg_match_all('/(<.+?>)?([^<>]*)/s', $text, $lines, PREG_SET_ORDER);
      $total_length = strlen($ending);
      $open_tags = array();
      $truncate = '';
      foreach ($lines as $line_matchings) {
        // if there is any html-tag in this line, handle it and add it (uncounted) to the output
        if (!empty($line_matchings[1])) {
          // if it's an "empty element" with or without xhtml-conform closing slash (e.g. <br/>)
          if (preg_match('/^<(\s*.+?\/\s*|\s*(img|br|input|hr|area|base|basefont|col|frame|isindex|link|meta|param)(\s.+?)?)>$/is', $line_matchings[1])) {
            // do nothing
          }
          // if tag is a closing tag (e.g. </b>)
          elseif (preg_match('/^<\s*\/([^\s]+?)\s*>$/s', $line_matchings[1], $tag_matchings)) {
            // delete tag from $open_tags list; names are stored lowercased, so compare lowercased
            $pos = array_search(strtolower($tag_matchings[1]), $open_tags);
            if ($pos !== FALSE) {
              unset($open_tags[$pos]);
            }
          }
          // if tag is an opening tag (e.g. <b>)
          elseif (preg_match('/^<\s*([^\s>!]+).*?>$/s', $line_matchings[1], $tag_matchings)) {
            // add tag to the beginning of $open_tags list
            array_unshift($open_tags, strtolower($tag_matchings[1]));
          }
          // add html-tag to $truncate'd text
          $truncate .= $line_matchings[1];
        }
        // calculate the length of the plain text part of the line; handle entities as one character
        $content_length = strlen(preg_replace('/&[0-9a-z]{2,8};|&#[0-9]{1,7};|&#x[0-9a-f]{1,6};/i', ' ', $line_matchings[2]));
        if ($total_length + $content_length > $length) {
          // the number of characters which are left
          $left = $length - $total_length;
          $entities_length = 0;
          // search for html entities
          if (preg_match_all('/&[0-9a-z]{2,8};|&#[0-9]{1,7};|&#x[0-9a-f]{1,6};/i', $line_matchings[2], $entities, PREG_OFFSET_CAPTURE)) {
            // calculate the real length of all entities in the legal range
            foreach ($entities[0] as $entity) {
              if ($entity[1] + 1 - $entities_length <= $left) {
                $left--;
                $entities_length += strlen($entity[0]);
              }
              else {
                // no more characters left
                break;
              }
            }
          }
          $truncate .= substr($line_matchings[2], 0, $left + $entities_length);
          // maximum length is reached, so get off the loop
          break;
        }
        else {
          $truncate .= $line_matchings[2];
          $total_length += $content_length;
        }
        // if the maximum length is reached, get off the loop
        if ($total_length >= $length) {
          break;
        }
      }
    }
    else {
      if (strlen($text) <= $length) {
        return $text;
      }
      else {
        $truncate = substr($text, 0, $length - strlen($ending));
      }
    }
    // if the words shouldn't be cut in the middle...
    if (!$exact) {
      // ...search the last occurrence of a space...
      $spacepos = strrpos($truncate, ' ');
      // strrpos() returns FALSE when there is no space; isset() would be true even then
      if ($spacepos !== FALSE) {
        // ...and cut the text in this position
        $truncate = substr($truncate, 0, $spacepos);
      }
    }
    // add the defined ending to the text
    $truncate .= $ending;
    if ($considerHtml) {
      // close all unclosed html-tags
      foreach ($open_tags as $tag) {
        $truncate .= '</' . $tag . '>';
      }
    }
    return $truncate;
}
?>
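
Two ideas inside the function above can be tried in isolation: counting each HTML entity as a single visible character, and keeping a newest-first list of open tags so they can be closed after the cut. The sketch below is only an illustration, not a replacement for the function. The helper names `visible_length_sketch()` and `unclosed_tags_sketch()` and the sample string are assumptions made up for this example, and the tag pattern is a simplified one rather than the exact expressions used above.

```php
<?php
// Hypothetical helpers for illustration; they are not part of _html_substr().

/**
 * Length of the visible text: tags are ignored and every HTML entity
 * counts as one character (same entity pattern as the function above).
 */
function visible_length_sketch($html) {
    $text = preg_replace('/<.*?>/s', '', $html);
    $text = preg_replace('/&[0-9a-z]{2,8};|&#[0-9]{1,7};|&#x[0-9a-f]{1,6};/i', ' ', $text);
    return strlen($text);
}

/**
 * Returns the closing tags needed to balance an HTML fragment,
 * innermost first, using a newest-first list like $open_tags.
 */
function unclosed_tags_sketch($html) {
    // Simplified list of elements that never take a closing tag.
    $void = array('img', 'br', 'input', 'hr', 'area', 'base', 'col', 'link', 'meta', 'param');
    $open = array();
    preg_match_all('/<\s*(\/?)\s*([a-z][a-z0-9]*)[^>]*?(\/?)>/i', $html, $tags, PREG_SET_ORDER);
    foreach ($tags as $tag) {
        $name = strtolower($tag[2]);
        if ($tag[3] === '/' || in_array($name, $void)) {
            continue; // self-closing or void element: nothing to close
        }
        if ($tag[1] === '/') {
            // Closing tag: drop the most recently opened tag with this name.
            $pos = array_search($name, $open);
            if ($pos !== false) {
                array_splice($open, $pos, 1);
            }
        } else {
            array_unshift($open, $name); // newest tag goes first
        }
    }
    $closers = '';
    foreach ($open as $name) {
        $closers .= '</' . $name . '>';
    }
    return $closers;
}

$html = '<p>Fish &amp; <b>chips</b> for everyone</p>';
$cut  = substr($html, 0, 20); // naive cut: '<p>Fish &amp; <b>chi'

echo visible_length_sketch($html), "\n";          // 25
echo $cut . unclosed_tags_sketch($cut), "\n";     // <p>Fish &amp; <b>chi</b></p>
```

A raw `substr()` can also stop in the middle of a tag or an entity, which this sketch does not repair. The full function avoids that case by splitting the input into tag and text chunks first and only ever cutting inside a text chunk.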
