Follow

# How to Clean Data in Excel -- Selecting only certain parts of the content

The LEFT/RIGHT/MID formulas allow you to select a cell from which you would only like certain parts of the content. One example could be that zip codes have been entered with 9 digits, e.g. 12345-6789, but you only need the first five digits. To make this conversion, insert a column next to the zip code column and use the LEFT formula to extract the first five digits.

Excel formula:
=LEFT(text, [num_chars])

‘Text’ refers to the cell you want to convert and [num_chars] is the number of characters you want to pull from the cell. The formula starts from the left of the selected cell and pulls the number of characters you want, in this case the number of characters is 5, as you only want the first five digits.

Turn your 9 digit zipcode entry into a 5 digit zip code with the LEFT formula.

Once you have entered the formula for all cells, it's important you turn your formulas into plain text. Highlight the cells that contain the LEFT formula, copy the cells (CONTROL+C or COMMAND+C on Mac computers), go to Edit -> Paste Special. Now, select 'Values' and press 'OK'. This will paste the cells as plain text so the formula no longer appears in the cells. This allows you to move cells around without having to worry about the formulas being wrong.

The RIGHT formula works the same way as the LEFT formula, except it starts from the right and counts backwards. Everything else is the same. The MID formula on the other hand is slightly different. MID is short for Middle, which means you can capture parts of a word, starting at any point in a cell.

As an example, say you have a phone number with a country code, such as this one: 001 (206) 123 4567, but you are only interested in capturing the area codes. i.e. the '206' part. The MID formula can help you extract this part.

Excel formula:
=MID(text,start_num,num_chars)

=MID([select a cell],[where you want to start counting from],[the number of characters you want to capture])

In the example of the phone 001 (206) 123 4567 where you want to capture the area code 206, you want to start at the number 2 after the first parenthesis and capture 3 characters. All characters and spaces are allocated a number starting with 1. In this case the first 0 is character number 1, the next 0 is character number 2, and so on. If you count 001 (206) 123 4567, the area code starts with character number 6 (notice, spaces also count as a character). The formula will therefore become:

Excel formula:
=MID(A1,6,3).