Mikey - 5 months ago
PHP Question

Should I be using htmlspecialchars?

I seem to have trouble understanding when to use htmlspecialchars().

Let's say I do the following when I am inserting data:

$_POST = filter_input_array(INPUT_POST, [
    'homepage' => FILTER_DEFAULT // do nothing
]);

$course = new Course();
$course->name = trim($_POST['name']);
$course->homepage = $_POST['homepage']; // may contain unsafe HTML

$courseDAO = DAOFactory::getCourseDAO();
$courseDAO->addCourse($course); // simple insert statement

When I output, I do the following:

$courseDAO = DAOFactory::getCourseDAO();
$course = $courseDAO->getCourseById($_GET['id']);

<?php ob_start() ?>

<h1><?= $course->name ?></h1>
<div class="homepage"><?= $course->homepage ?></div>

<?php $content = ob_get_clean() ?>

<?php include 'layout.php' ?>

I would like that $course->homepage be treated and rendered as HTML by the browser.

I've been reading answers on this question. Should I be using htmlspecialchars() anywhere here?


There are three types of data that you might output into HTML:

  • Text
  • Trusted HTML
  • Untrusted HTML

If it is text, then use htmlspecialchars to convert it to HTML.
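
A minimal sketch of the text case, assuming the page is served as UTF-8. The helper name e() and the sample string are hypothetical and do not come from the question:

```php
<?php
// Hypothetical helper: escape plain text so it can be placed in HTML
// element content or inside a quoted attribute value.
function e(string $text): string
{
    // ENT_QUOTES escapes both double and single quotes.
    // ENT_SUBSTITUTE replaces invalid byte sequences instead of
    // returning an empty string.
    return htmlspecialchars($text, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');
}

// Sample value standing in for a course name typed by a user.
$name = 'Intro to <script>alert(1)</script> & "PHP"';

// The markup characters are shown literally by the browser, not executed.
echo '<h1>' . e($name) . '</h1>', PHP_EOL;
```

In the question's template, $course->name is plain text, so it falls under this case and would be written as <?= htmlspecialchars($course->name) ?> (or through a helper like the one above).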

If it is trusted HTML, then just output it.

If it is untrusted HTML, then you need to sanitise it to make it safe. That generally means parsing it with a DOM parser, removing all elements and attributes that do not appear on a whitelist of safe ones (some attributes may be special-cased to be filtered rather than stripped), and then converting the DOM back to HTML. Tools like HTML Purifier exist to do this.
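
The parse, filter, and serialise steps can be sketched with PHP's DOM extension. This is only an illustration of the idea, not a replacement for a maintained tool like HTML Purifier. The whitelist, the function names, and the rule that href must start with http:// or https:// are all assumptions made for the example:

```php
<?php
// Illustrative whitelist (an assumption): tag name => allowed attributes.
const ALLOWED_TAGS = [
    'p' => [], 'b' => [], 'i' => [], 'em' => [], 'strong' => [],
    'ul' => [], 'ol' => [], 'li' => [], 'br' => [],
    'a' => ['href'],
];

// Recursively remove every node and attribute that is not whitelisted.
function cleanChildren(DOMNode $parent): void
{
    // Copy the node list first because we modify the tree while walking it.
    foreach (iterator_to_array($parent->childNodes, false) as $child) {
        if ($child->nodeType === XML_TEXT_NODE) {
            continue; // text is kept and escaped again when serialised
        }
        $tag = strtolower($child->nodeName);
        if (!($child instanceof DOMElement) || !array_key_exists($tag, ALLOWED_TAGS)) {
            $parent->removeChild($child); // drops the node and its contents
            continue;
        }
        foreach (iterator_to_array($child->attributes, false) as $attr) {
            $name = strtolower($attr->nodeName);
            $ok = in_array($name, ALLOWED_TAGS[$tag], true);
            if ($ok && $name === 'href') {
                // Filtered rather than stripped: keep only http(s) links.
                $ok = preg_match('#^https?://#i', $attr->value) === 1;
            }
            if (!$ok) {
                $child->removeAttributeNode($attr);
            }
        }
        cleanChildren($child);
    }
}

function sanitizeHtml(string $dirty): string
{
    $doc = new DOMDocument();
    $previous = libxml_use_internal_errors(true); // silence parse warnings
    // The XML declaration is a common hint so the input is read as UTF-8.
    $doc->loadHTML('<?xml encoding="UTF-8"><body>' . $dirty . '</body>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $body = $doc->getElementsByTagName('body')->item(0);
    if ($body === null) {
        return '';
    }
    cleanChildren($body);

    // Convert the filtered DOM back to an HTML string.
    $clean = '';
    foreach ($body->childNodes as $child) {
        $clean .= $doc->saveHTML($child);
    }
    return $clean;
}

echo sanitizeHtml('<p onclick="x()">Hi <b>there</b><script>alert(1)</script></p>'
    . '<a href="javascript:alert(1)">bad</a>'), PHP_EOL;
```

For real use, HTML Purifier's documented basic usage is to create a config with HTMLPurifier_Config::createDefault(), pass it to new HTMLPurifier(), and call purify() on the untrusted string.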

$course->homepage = $_POST['homepage']; // may contain unsafe HTML

I would like that $course->homepage be treated and rendered as HTML by the browser.

Then you have the third case and need to filter the HTML.