Adam S. Cochran Adam S. Cochran - 6 days ago 8
SQL Question

Split MySQL fields by character count to the nearest full word?

I have a MySQL field for the body texts of blog posts in markdown format. Due to an API I'm using I can only send 3000 character blocks, however some of my posts are as large as 4500 characters and there are over 2000 of them so I don't want to manually split them out.

I'm trying to figure out a function to check the char_length of each field in the column and if it is over 3000 characters the function would split out anything beyond 3000 characters (rounded to the nearest word) into a second column I have. It's beyond the scope of functions I've dealt with before and so I'm hoping for a push in the right direction. Here is the base of what I have so far:

SELECT `Body` from `blogposts`
WHERE char_length(Body) > 3000
SET
Body2 = SUBSTRING(`Body`, 3001, char_length(Body))
Body = SUBSTRING(`Body`, 1, 3000)


Since it isn't yet complete I've not tested it yet. I'm not convinced it would come close to doing what I want but the two other problems I'm still trying to solve before I test are:

1) How to have it go to the end of the most recent word (rounded down under 3000 characters) rather than split at exactly the 3000th character.

2) If it is trying to deal in words will it break on markdown/html that is in the text such as splitting
<div>
into
<div" ">
if that's the 3000th character.

For background I've read through the following:

Split string into table in groups of 26 characters or less, rounded to the nearest word

The responses seem to have come up with custom functions to split the string based on a set length, although the functions aren't well explained/commented so I'm a bit lost.

If it isn't easy to do in MySQL I am open to pulling this out in PHP and manipulating the data there.

Any insights would be appreciated!

Answer
update   `blogposts`

set     `Body2` = substring(`Body`,3000-instr(reverse(left(`Body`,3000)),' ')+1) 
       ,`Body` = left(`Body`,3000-instr(reverse(left(`Body`,3000)),' '))  

where   char_length(Body) > 3000
;

Demo on 30 characters

set @Body = 'My name is Inigo Montoya! You''ve killed my father, prepare to die!';

select  left(@Body,30-instr(reverse(left(@Body,30)),' '))         as field_1
       ,substring(@Body,30-instr(reverse(left(@Body,30)),' ')+1)  as field_2
;       

+---------------------------+------------------------------------------+
| field_1                   | field_2                                  |
+---------------------------+------------------------------------------+
| My name is Inigo Montoya! | You've killed my father, prepare to die! |
+---------------------------+------------------------------------------+

Full Example

create table `blogposts` (`Body` varchar(3000),`Body2`  varchar(3000));

insert into blogposts (`Body`) values

 ('Hello darkness, my old friend'                          )
,('I''ve come to talk with you again'                      )
,('Because a vision softly creeping'                       )
,('Left its seeds while I was sleeping'                    )
,('And the vision that was planted in my brain'            )
,('Still remains'                                          )
,('Within the sound of silence'                            )
,('In restless dreams I walked alone'                      )
,('Narrow streets of cobblestone'                          )
,('''Neath the halo of a street lamp'                      )
,('I turned my collar to the cold and damp'                )
,('When my eyes were stabbed by the flash of a neon light' )
,('That split the night'                                   )
,('And touched the sound of silence'                       )
,('And in the naked light I saw'                           )
,('Ten thousand people, maybe more'                        )
,('People talking without speaking'                        )
,('People hearing without listening'                       )
,('People writing songs that voices never share'           )
,('And no one dared'                                       )
,('Disturb the sound of silence'                           )
;

select  left(`Body`,30-instr(reverse(left(`Body`,30)),' '))         as Body
       ,substring(`Body`,30-instr(reverse(left(`Body`,30)),' ')+1)  as Body2

from    `blogposts`

where   char_length(Body) > 30
;

+------------------------------+---------------------------+
| Body                         | Body2                     |
+------------------------------+---------------------------+
| I've come to talk with you   | again                     |
+------------------------------+---------------------------+
| Because a vision softly      | creeping                  |
+------------------------------+---------------------------+
| Left its seeds while I was   | sleeping                  |
+------------------------------+---------------------------+
| And the vision that was      | planted in my brain       |
+------------------------------+---------------------------+
| In restless dreams I walked  | alone                     |
+------------------------------+---------------------------+
| 'Neath the halo of a street  | lamp                      |
+------------------------------+---------------------------+
| I turned my collar to the    | cold and damp             |
+------------------------------+---------------------------+
| When my eyes were stabbed by | the flash of a neon light |
+------------------------------+---------------------------+
| And touched the sound of     | silence                   |
+------------------------------+---------------------------+
| Ten thousand people, maybe   | more                      |
+------------------------------+---------------------------+
| People talking without       | speaking                  |
+------------------------------+---------------------------+
| People hearing without       | listening                 |
+------------------------------+---------------------------+
| People writing songs that    | voices never share        |
+------------------------------+---------------------------+

update  `blogposts`

set     `Body2` = substring(`Body`,30-instr(reverse(left(`Body`,30)),' ')+1) 
       ,`Body`  = left(`Body`,30-instr(reverse(left(`Body`,30)),' '))        

where   char_length(`Body`) > 30
;

select  `Body`
       ,`Body2`

from    `blogposts`

where   `Body2` is not null
;

+------------------------------+---------------------------+
| Body                         | Body2                     |
+------------------------------+---------------------------+
| I've come to talk with you   | again                     |
+------------------------------+---------------------------+
| Because a vision softly      | creeping                  |
+------------------------------+---------------------------+
| Left its seeds while I was   | sleeping                  |
+------------------------------+---------------------------+
| And the vision that was      | planted in my brain       |
+------------------------------+---------------------------+
| In restless dreams I walked  | alone                     |
+------------------------------+---------------------------+
| 'Neath the halo of a street  | lamp                      |
+------------------------------+---------------------------+
| I turned my collar to the    | cold and damp             |
+------------------------------+---------------------------+
| When my eyes were stabbed by | the flash of a neon light |
+------------------------------+---------------------------+
| And touched the sound of     | silence                   |
+------------------------------+---------------------------+
| Ten thousand people, maybe   | more                      |
+------------------------------+---------------------------+
| People talking without       | speaking                  |
+------------------------------+---------------------------+
| People hearing without       | listening                 |
+------------------------------+---------------------------+
| People writing songs that    | voices never share        |
+------------------------------+---------------------------+
Comments